Introduction

The Belt and Road Initiative (BRI) is a collaboration project launched by the Chinese Government to connect more than 65 countries all over the word by developing infrastructures, facilities and support among the involved Countries and to encourage innovation in less developed Countries (Cui et al. 2017; Liu and Dunford, 2016).

The Silk Road Disaster Risk Reduction (SiDRR) project (Lei et al. 2018) is one of the prioritized sub-projects of the BRI . The purpose of the SiDRR is to carry out a long-term research project dealing with natural hazard assessment and risk mitigation in the Belt and Road Countries. A group of experts, with the role of scientific coordination of the activities carried out by the involved Countries, as well as the dissemination of the results, has been created. The expected outcomes of the research activities of the group are the assessment of geo-hydrological hazards in the Belt and Road Countries and the definition of risk mitigation measures.

Risk, hazard and susceptibility zoning are three complementary approaches to support land planning. They implicate a decreasing complexity in method and in data types, respectively. Considering the small scale of the SiDRR analysis and the available data, a landslide susceptibility zoning has been proposed to build a map which should give a general overview of the landslide-prone areas in the Belt and Road Countries. The goal is to individuate the most susceptible areas where to focus further and more detailed assessments. “In mathematical form, landslide susceptibility, is the probability of spatial occurrence of known slope failures, given a set of geoenvironmental conditions” (Guzzetti et al. 2006). Otherwise, landslide susceptibility can be defined as the spatial component of the hazard (Reichenbach et al. 2018) which, in turn, is the combination of the frequency of landslide occurrence and the susceptibility map (Fell et al. 2008). Thus, landslide susceptibility can be considered a fundamental part of the process to reach landslide hazard and risk assessment. At the same time, it can be used in land use planning for large areas or in analyses characterized by scarcity of data (Corominas et al. 2014).

This work will support the development of landslide hazard prevention and mitigation measures proposing a multi-scale approach for landslide susceptibility zoning and discussing its first application in a test area.

The used scale classification has been derived by Glade and Crozier (2012) and introduced by Soeters and Van Westen (1996). One class has been added to the original ranges: large scales (> 1:10,000), medium scales (1:15,000–1:100,000), regional scales (1:125,000–1:500,000), national scales (1:750,000–1:2,000,000) and continental scales (< 1:5,000,000).

An approach to landslide susceptibility assessment based on the analysis of the state-of-the-art is presented, where the terms “large area” and “small scale” are used as synonyms and they refer to continental scale. To show the feasibility and the robustness of the suggested approach, the case study of south-Asia has been analyzed. The results have been evaluated and validated by means of ROC analysis.

State of the art

Landsliding is a complex process driven by several possible predisposing and triggering factors. Numerous causative factors (geo-environmental factors) may predispose slope to failure, such as: geology, topography, tectonics, land cover and use, hydrology, and others (Eckelmann et al. 2006). On the other hand, the processes which trigger a landslide can be different and include: intense or prolonged rainfall, earthquakes, rapid snow melting, volcanic activity, human actions and others (Guzzetti et al. 2012).

The goal of landslide susceptibility zoning is to analyze the probability of landslide occurrence under the influence of a combination of factors, not including landslide frequency. Therefore, the temporal factor is not taken into consideration (Chacón et al. 2006).

Many different methods have been proposed for landslide susceptibility zoning in the scientific literature. The choice of a susceptibility mapping method significantly influences the prediction capacity of the analysis. According to Corominas et al. (2014), the methods can be categorized as qualitative (knowledge-driven methods) and quantitative (data-driven methods). The former take advantage of the theoretical and empirical knowledge of the researchers to make scientific analysis and judgment (Axing et al. 2010; Ayalew et al. 2004; Barredo et al. 2000; Günther et al. 2014; Saaty 1990). The latter recreates the relation between landslides and their controlling factors using a mathematical model: physically (Chung and Fabbri 2003; Goetz et al. 2011; Gorsevski et al. 2006) or statistics-based (Agterberg et al. 1989; Bonham-Carter et al. 1988; Bui et al. 2016; Carrara et al. 2008; Catani et al. 2013; Chen et al. 2016, 2017; Constantin et al. 2011; Eeckhaut et al. 2012; Gorsevski et al. 2000; Pham et al. 2016; Yao et al. 2008).

Reichenbach et al. (2018) pointed out that Logistic Regression is one of the most diffuse statistics-based models for landslide susceptibility on both large and small scales. Concerning landslide susceptibility assessment at continental, or similar scale, several different methodologies have been used so far. Most of them are knowledge-driven methods and statistic-based methods. In addition, physically based methods are excluded in small scale analyses, because they require a detailed knowledge of the landslide dynamics, which is not feasible for large areas.

The scarcity of data, which is very common in small-scale analyses, may be a driven factor in model selection. In particular, the lack of landslide inventories may affect the robustness and quality of the results. However, as stated by Hong et al. (2007) this is not always true: “more information does not necessarily lead to better results, depending on the quality of the data”.

Due to the lack of a global landslide data set at that time, Hong et al. (2007) have produced a global landslide susceptibility map without landslide inventories. They have weighted the causative factors on the base of reference studies and information available combined through a linear combination method. Following a similar idea, Eeckhaut et al. (2012) proposed a statistical model application with limited landslide inventory data. They evaluated the landslide susceptibility over Europe with the Logistic Regression model.

The recent development of global landslide inventories has supported new analyses at global scale (e.g., Kirschbaum and Stanley 2018; Stanley and Kirschbaum, 2017) which have produced a global landslide susceptibility map for rainfall-triggered landslides based on the fuzzy overlay method. This approach combines landslide inventories with expert opinions to develop a heuristic model.

At the continental scale, Günther et al. (2014) and then Wilde et al. (2018) presented the landslide susceptibility maps of Europe, named ELSUSv1 and ELSUSv2, respectively. Despite that, they had numerous national landslide inventories heterogeneously distributed, they have proposed a qualitative model (Spatial Multi Criteria Evaluation) using Analytical Hierarchy Process.

A different approach at continental scale has been used by (Broeckx et al. 2018) who analyzed the landslide susceptibility all over Africa based on a well-distributed inventory applying Logistic Regression model.

As a concern, the study area here considered (South-Asia) numerous landslide susceptibility maps at national or smaller scale have been produced in recent years. Some of those, based on the Neural Network method with about 1300 landslides all over China are reported in Liu et al. (2013). Recently, Saponaro et al. (2015) have covered Uzbekistan, Tajikistan and Kyrgistan, using the Weight of Evidence method.

In the context of the European Union’s Thematic Strategy for Soil Protection (EC, 2006), the Soil Information Working Group (SIWG) of the European Soil Bureau Network (ESBN) has promoted a project for the identification of landslide hazard priority areas. The European Landslide Expert Group (Günther et al. 2013a) has then put forward a multi-Tier susceptibility analysis in the Guidelines for Mapping Areas at Risk of Landslides in Europe (Hervás 2007). Taking inspiration from the latter, and considering, the extension of the study area and the geomorphological, geological, cultural, scientific heterogeneity of the context, the Weight of Evidence (WoE) method has been selected. Therefore, the application of the selected method and the Tier 1 approach at the south-Asia region is presented here.

Case study

The Belt and Road Initiative involves 3 continents (63% of the world), and more than 65 countries (Cui et al. 2017). The study area selected includes a relevant part of the Belt and Road Countries (Fig. 1), namely: China, Pakistan, India, Tajikistan, Bangladesh, Nepal, Afghanistan, Bhutan, Myanmar, Cambodia, Kyrgyzstan, Laos, Thailand, Viet Nam. The study area comprises a high density of landslides mainly triggered by heavy rainfall (more than 1600 mm/y) (Fig. 7) as well as some of the most disastrous earthquakes that have recently occurred in the world (i.e., Nepal 2015 and Wenchuan 2008). The diversity of climate, topography, geological features and land cover result in an extremely complex environment on which to assess landslide susceptibility.

Fig. 1
figure 1

Study area of the south-Asia

Materials and methods

The Tiers-based workflow produced consequential susceptibility zoning of the same study area by growing scales (Fig. 2). The smallest scale provides an overview of the object of study and it delineates the priorities, i.e., the most susceptible regions. Therefore, Tier 1 assessment exploits low-resolution data and incomplete spatial information. With the increase of the scale of analysis, the Tier 2 approach is intended to detail the landslide susceptibility analysis conducted by the Tier 1 (Günther et al. 2013b). The scale of Tiers cannot be defined a priori, since it depends on the available data resolution and the spatial extent of the study area. This means that the results should reflect, at least, the minimum resolution of the data input.

Fig. 2
figure 2

Workflow of the Tiers-based approach

The landslide prediction model represents the core of the entire analysis. The model proposed here is: (i) temporally/geographically reproducible; (ii) simple and thus clear to the people involved; (iii) as realistic as possible. A qualitative assessment for Tier 1 level has been proposed by Günther et al. (2007). It was based on the expertise of the researchers responsible for the analysis, thus the reproducibility depends on the investigator. To make the results reproducible in time and in all areas, the assessment technique should be quantitative and as objective as possible. The use of physical predictors and of the validation procedure define the physical relevance of the model in accordance with the geological and geomorphological features of the study area. Therefore, to create a reproducible, simple, realistic landslide susceptibility map not affected by subjectivity, misunderstanding and abstraction, a limited number of causative factors related to all types of landslides and a quantitative susceptibility modeling technique, suitable to the specific Tier, have been assumed.

The landslide susceptibility concept is based on the simple principle that landsliding will occur more frequently in the most susceptible areas characterized by similar geo-environmental factors which predispose towards slope failures. A landslide inventory and selected causative factors are the preliminary requirements for susceptibility analysis (Van Westen et al. 2008), especially if a statistic-based correlation analysis is applied (Bui et al. 2016) in accordance with the scale of mapping, the required usage, and the quality of the data available (Fell et al. 2008).

A landslide inventory is an essential part of the input dataset in landslide susceptibility mapping. It generally records the location, the date of occurrence and type of mass movements (Margottini et al. 2013). In this analysis, three different alternatives have been taken into consideration: (i) aggregation and homogenization of all local landslide datasets available in the Countries involved in the project; (ii) development of a new landslide dataset; and (iii) collection of global landslide datasets. The latter solution has been selected due to the complexity of the aggregation processes and the lack of information. Indeed, the regional slope failure database is sometimes not complete or, more often, it is absent completely. The few data available locally are often limited to recent years and, therefore, not representative of the instability condition as it is.

The NASA-COOLR dataset (Juang et al. 2019) has been selected for the purpose of this work. It is an open database for landslide events launched in 2018 which collects different inventories from different sources: Landslide Reporter Catalog (LRC) (Juang et al. 2019), NASA Global Landslide Catalog (GLC) (Kirschbaum et al. 2010, 2015) and collated inventories from external local sources. The LRC includes landslide reports by citizen scientists through the Landslide Reporter and checked by NASA. The GLC is a global inventory of rainfall-triggered landslides compiled by NASA since 2007 and is based on online media reports, disaster databases, scientific reports and more (Kirschbaum et al. 2015). The rest of the landslides are added into the COOLR by the LRC and other sources such as the SERVIR-Mekong team for Myanmar landslides (SMMML). The team collected landslides based on Google Earth imagery (Juang et al. 2019).

The landslide inventory of COOLR used for the analysis was downloaded in June 2020. Each point in the inventory has been assigned a radius of confidence between 1 and 75 km. In accordance with the spatial resolution of the analysis the landslide locations with 1 km of accuracy have been selected (1549 landslides). The landslide attributes are shown in Fig. 3 and Table 1. The heterogeneity of data sources implicates the variability of information. As a consequence, some information such as the exact location of the landslides have been collected but others remain unknown (e.g., landslide category and trigger).

Fig. 3
figure 3

Landslide frequencies classified by attributes of 1 km location accuracy dataset (1549). The attributes include ‘unknown’, ‘others’ and empty records which are not reported into the graphs. They are 1142 of ‘landslide category’, 1053 of ‘landslide size’, 1109 of ‘landslide trigger’, 1040 of ‘date of the event’ and 880 of ‘landslide setting’

Table 1 Number of COOLR landslides per country with location accuracy ‘exact’ or equal to 1 km

Considering the limited information available on each event, a cross-check has been conducted to evaluate the reliability of the inventory for the purposes of this analysis. DEM and satellite images, along with a number of pictures, available for 118 landslides have been analyzed. The 250 m DEM has been downloaded from CIAT website (Reuter et al. 2007) which is the result of a resampling process from the 30 m SRTM data.

The sample is not properly representative of the inventory, but, it allows some possible incompatibilities with the goal of the analysis to be highlighted. As a result, some features have been removed from the inventory. For example: 3 events are classified as snow avalanches, whereas 47 landslides appear to be related with human alterations of the natural landscape (mining, engineered slopes and retaining walls) which are strictly site-specific. In particular, for 6 of the latter a photo link is available. They show that slope instability events occurred during mining activities and construction works which are different from slope cutting, these are probably triggered by anthropic activities. It could be argued that all the 47 landslides have been probably triggered by antrophic activities as well and caused by the same conditions. Therefore, they have been deleted from the inventory. Then, given a radius of confidence of approximately 1 km around each feature (9 × 9 cells of 250 m-side pixel), some features have been removed, since the slope of the terrain is lower than 3°, thus, they may be considered excavation collapses (38 events). At the end of this cross-check activity, 1461 landslides have been selected for the susceptibility zoning (Fig. 4).

Fig. 4
figure 4

Landslide dataset used for the analysis and the features removed after the cross-check

The analysis focuses on the development of future scenarios based on the prediction of landslides spatial distribution. It reproduces spatially the combination of factors responsible for previous events.

The selection of the causative factors for the multi-scale analysis depends on the Tier. In the context of the Tier 1 analysis, the considered causative factors are: slope degree, plan curvature, profile curvature, relative relief, lithology, land cover, Peak Ground Acceleration (PGA) and annual rainfall (Table 2).

Table 2 Causative factors

As stated by (Fell et al. 2008), “areas with similar topography, geology and geomorphology as the areas which have experienced landsliding in the past are also likely to experience landsliding in the future”. To identify the causes of past landslides and predict future scenarios, the statistical approach proposed in this work requires the classification of the causative factors. The classification significantly affects the prediction skill of the analysis. Therefore, classifications previously proposed in papers and technical reports have been assigned to the causative factors considered here. The diagram in Fig. 5 shows how the causative factors have been pre-processed.

Fig. 5
figure 5

Pre-processing of the causative factors

The morphological factors for Tier 1 assessment have been classified according to the slope angle, plan curvature (curvature tangent to the contour line), profile curvature (curvature tangent to the slope line) and relative relief (the maximum range of elevation in a neighborhood of 1 km of radius).

The morphological factors have been derived from the 30 m Shuttle Radar Topography Mission (SRTM) DEM (Farr et al. 2007; Florinsky et al. 2019). The morphological factors have been calculated in Google Earth Engine (GEE) (Gorelick et al. 2017) using Terrain Analysis in Google Earth Engine (TAGEE) a GEE package for terrain analysis (Safanelli et al. 2020). GEE allowed us to calculate slope angle, plan and profile curvature and relative relief with a pixel size of 30 m then resampled into a square grid of 3 × 3 km by average calculation. To reduce the computational cost and balance the amount of stable and unstable cells of the dataset, the 30 m cells with slope degree lower than 8° have been masked for all the causative factors. This value has been selected considering the average slope of the debris flow fans which represent the steepest terrain in which landslides are not expected but accumulation landforms. Therefore, the valley bottoms along with all the depositional forms that couldn’t be affected by instability processes in the mountain areas have been excluded from the analysis. The relief has been calculated as the range between the maximum and the minimum elevation in a buffer radius of 1 km around each 30 m pixel.

The slope angle (terrain gradient) classification reflects the classes used for the ELSUSv1 (Günther et al. 2014): 0°, 1–3°, 4–6°, 7–10°, 11–15°, 16–20°, 21–30°, > 30°. The curvatures have been classified in quartiles. Therefore, profile curvature has been classified in: < − 2.6 10–4, -2.6 10–4 to − 1.4 10–4, − 1.4 10–4 to − 5.4 10–5, − 5.4 10–5 to 4.2 10–5, > 4.2 10–5. Plan curvature has been classified in: < 1.3 10–7, 1.3 10–7–1.2 10–4, 1.2 10–4–2.2 10–4, 2.2 10–4–3.8 10–4, > 3.8 10–4. Relative relief has been divided in deciles: 0–9 m, 9–18 m, 18–34 m, 34–65 m, 65–120 m, 120–194 m, 194–288 m, 288–424 m, 424–625 m, > 625 m (Fig. 6).

Fig. 6
figure 6

Causative factors selected for the Tier 1 landslide susceptibility: a slope, b profile curvature c plan curvature and d relief

The geological factor has been proposed by Hartmann and Moosdorf (2012). They have mapped the lithology of the globe into 16 classes, all of them present in South-Asia. They have been grouped into 7 classes: (1) ice, glaciers and water bodies; (2) siliciclastic sedimentary rocks, mixed sedimentary rocks, carbonate sedimentary rocks and pyroclastics; (3) mixed sedimentary rocks; (4) evaporites; (5) acid volcanic rocks, intermediate volcanic rocks, basic volcanic rocks; (6) acid plutonic rocks, intermediate plutonic rocks, basic plutonic rocks; and (7) metamorphic rocks (Fig. 7e).

Fig. 7
figure 7

Causative factors selected for the Tier 1 landslide susceptibility: e lithology, f land cover, g PGA and h. precipitation

In regards to the land cover classification, the ESA GlobCover 2009 Project has classified the land cover information into 22 classes (Bontemps et al. 2011) which have been grouped into 8 categories (Table 3) according to the United Nations (FAO) Land Cover Classification System (LCCS) (Di Gregorio 2016). Therefore, the 8 LCCS classes have been suggested for Tier 1 land cover factor subdivision (Table 3) (Fig. 7f).

Table 3 Land cover classification conversion from GlobCover classification to LCCS (Bontemps et al. 2011; Di Gregorio 2016)

Since most of the landslides in Asia are mainly triggered by rainfall and earthquakes, two factors have been included: annual rainfall and PGA.

The annual rainfall factor has been calculated from the annual sum of the daily precipitation measured by the Multi-satellitE Retrievals for Global Precipitation Measure (IMERG) (Huffman et al. 2019) with a cell size of 10 × 10 km. The final result is the average of the annual precipitation over 11 years (2009–2019) resampled to a 3 km square grid using the bilinear resampling method of SAGA GIS. The map has been classified into deciles: 0–111 mm/y, 111–217 mm/y, 217–339 mm/y, 339–478 mm/y, 478–605 mm/y, 605–769 mm/y, 769–1003 mm/y, 1003–1340 mm/y, 1340–1731 mm/y, > 1731 mm/y (Fig. 7 h).

The PGA map has been developed from a collaboration among the Columbia University Center for Hazards and Risk Research (CHRR) and Columbia University Center for International Earth Science Information Network (CIESIN) using Global Seismic Hazard Program (GSHAP) data. It includes areas with a probability to exceed at least 10% the PGA in a time span of 50 years (> 2 m/s2). The PGA have been classified into deciles from the 1th to the 10th (CHRR-Columbia University, CIESIN-Columbia University 2005, Dilley et al. 2005). The zero class has been added (Fig. 7g).

The details about the data type, source and quality are reported in Table 4. All the data used for the analysis are freely available from different global databases (Table 4) (Figs. 6, 7).

Table 4 Data freely available from published databases for Tier 1 application

The WoE technique has been proposed for the mathematical evaluation of the Landslide Susceptibility Index (LSI). The WoE model introduced by Agterberg et al. (1989) and then by Bonham-Carter et al. (1988) is a bivariate statistical analysis, which compares dependent (landslide inventory) and independent variables (causative factors), it is used to evaluate landslide susceptibility (Pasuto and Tagliavini 2007). It assigns two weights (W+, W) to the classes of each causative factor. The weights W+ and W mean that the presence of the factor is favorable to slope instability and the presence of the factor is favorable to slope stability, respectively. The general formulations of Agterberg et al. (1989) are the following:

$$W^{ + } = \ln \frac{{P(B\left| {D)} \right.}}{{P(B\left| {D_{1} )} \right.}}$$
(1)
$$W^{ - } = \ln \frac{{P(B_{1} \left| {D)} \right.}}{{P(B_{1} \left| {D_{1} )} \right.}}$$
(2)
$$W_{f} = W^{ + } - W^{ - }$$
(3)

where P is the probability, B is the presence of a potential landslide causative factor, B1 is the absence of a potential landslide causative factor, D is the presence of a landslide and D1 represents the absence of a landslide. Wf is called weight contrast: the magnitude of the contrast reflects the overall spatial relation between causative factors and landslides (Dahal et al. 2008).

The landslide susceptibility is mapped by the sum of ith weights contrast of the classified maps for the n causative factors:

$$LSI = \sum\limits_{i = 1}^{n} {W_{fi} }$$
(4)

The result is the Landslide Susceptibility Index (LSI). Once standardized, it represents a measure of the landslide likelihood of occurrence or a measure of the potential spatial distribution of future landslides.

To evaluate the ability of the susceptibility model to predict the spatial distribution of the landslides and to evaluate the robustness of the model fitting capacity, the Area Under the Curve (AUC), calculated from the Receiving Operating Characteristic (ROC) curve (Chung and Fabbri 2003; Fawcett 2006), has been proposed. The ROC curve explores the relation between the True Positive Rate and the False Positive Rate by consecutive cutoffs of the LSI. Formally, each map-unit of the susceptibility map is labeled with True Positive (tp), False Positive (fp), True Negative (tn) and False Negative (fn) tags. In the susceptibility map, a map-unit is True or False according to the presence or the absence of landslides, respectively. Moreover, the unit is considered Positive or Negative if the relative susceptibility value is higher (stable unit) or lower (unstable unit) than the cutoff. The ROC curves are graphed coupling tprate (y-axis) and fprate (x-axis):

$${tp_{{{\text{rate}}}} = \frac{tp}{{tp + fn}}}$$
(5)
$${fp_{{{\text{rate}}}} = \frac{fp}{{fp + tn}}}$$
(6)

The area underlying the ROC curve (AUC) can be used as a metric to assess the overall quality of a model: the larger the area, the better the performance of the model over the whole range of possible cutoffs. Therefore, if the AUC is equal to 1 it means that the results are perfect, whereas if it is equal to 0.5 the scenario predicted is unlikely.

Considering the accuracy of the landslide inventory, the analysis has been conducted with the pixel size of 3 km x 3 km. Thus, the causative factors have been resampled to the same size before the statistical analysis. The categorical factors have been resampled taking into account the predominant class, while for the continuous factors, the average of the included values have been calculated. The landslide inventory has been divided randomly in two datasets: 70% and 30%, to train and validate the model. The software used for the analysis are QGIS, SAGA-GIS. The simulation based on the WoE and the validation have been processed with the SZ-plugin (Titti and Sarretta 2020) developed for QGIS. The ROC analysis has been carried out using the Scikit-learn module (Pedregosa et al. 2011).

Results and discussion

The landslide susceptibility map, resulting from the analysis, is shown in Fig. 8a and the class weights are reported in Table 5.

Fig. 8
figure 8

a 3 × 3 km landslide susceptibility map of south-Asia. b Spatial distribution of the LSI. c Prediction rate curve and Success rate curve of the landslide susceptibility map

Table 5 Positive weights, negative weights and contrast weights of the classified causative factors resulting from the WoE analysis of the study area

The WoE is a bivariate approach which evaluates the single predisposing factor in relation with the dependent variable, Table 5 reports the W+, W, Wf values and the percentage of landslide cells and area of each class factor. The weight contrasts of slope factor show an almost constant increase in instability from 0° to > 30°, with a peak around 21°−30° and a negative value between slope angle of 7° and 15°. Noticeably, the mask until 7° of slope adopted for the factors has excluded the first three classes of the slope factor, since their cell value is equal to the average of 30 m slope. Even though the pixel size of the analysis cannot perfectly describe the land surface morphology, the trend of the Wf is realistic. Moreover, the highest number of landslides is present in the class 21°−30°, which includes the 52% of the landslide cells, revealing the highest Wf of the slope factor. The slope factor also includes the most stable class of all factors which is the class 7°–10° with Wf equal to − 6.

Plan and profile curvatures represent the convexity and concavity of the surface tangent to the contour line and to the slope line, respectively. The former is related to the lateral flow convergence or divergence, while the second to the acceleration and deceleration of a flow along the gravity direction. Based on the Wf values of these factors, there is not a relevant difference between them. The landslides percentage per class and the area percentage are very balanced. (Table 5).

The relative relief presents an increasing trend from 18 m to > 625 m. Since the classes represent deciles, the area of each class is almost 10% of the total. Therefore, the trend of the Wf is dependent on the landslide included. The most unstable classes are the 424–625 m and > 625 which include the 58% of the total landslide cells (Table 5).

As regards the lithology factor, the results confirm that Wf parameters must be analyzed in relation to the other classes and with reference to the specific study area. Indeed, depending on the geological context, some unconsolidated lithologies might have lower strengths compared to metamorphic ones. Here, “unconsolidated sediment” is the most stable class, while “metamorphic rocks” is the least stable. In particular, the stability of the former comes from the high extension of the area (31% of the total area), although it includes the 10% of the landslide cells, while the instability of the latter is due to the balance between the number of landslides included (15% of the total landslide cells) and the area covered by the class (7% of the total). The second highest Wf is the “sedimentary rocks” which covers about 42% of the total area and the 60% of the landslide cells (Table 5).

Regarding the land cover, it contains one of the most stable classes and the most unstable class of all considered factors. Indeed, the land covered by “shrub” reflects a Wf value equal to 2.35, while “bare areas” display the lowest value, equal to −3.77 (Table 5).

The Wf values of the PGA and precipitation classes reveal that the landslides included in the inventory are mainly triggered by precipitation (Fig. 3). The PGA Wf are variable between − 1.57 and 2.01 without a precise trend. The precipitation classes have an increasing trend similar to relief and slope. The higher the annual precipitation, the higher the instability up to a Wf value of 2.55.

The standardized LSI (0–1) has been divided into 5 classes (Fig. 9) to fit the success curve as best as possible: 0–0.54 “very low”, 0.54–0.64 “low”, 0.64–0.74 “moderate”, 0.74–0.80 “high”, 0.80–1 “very high”. Statistically, in the 5-classes susceptibility map (Fig. 9a), the highly susceptible terrain covers 5% (Fig. 9b) of the total mapped area, more than the 93% of that is steeper than 15° and the 92% has a relief higher than 194 m. On the contrary, the 68% of the “very low” susceptibility areas, which cover the 68% (Fig. 9b) of the study area, respectively, has a relief lower than 120 m (87% lower than 288 m).

Fig. 9
figure 9

a Classified 3 × 3 km landslide susceptibility map of south-Asia. b Spatial distribution of the LSI and the area covered by the relative classes. c Prediction rate curve and Success rate curve of the landslide susceptibility map

The weights W+ and W are calculated from the relation between the number of landslides included or excluded into the class factor and the size area of the class factor. The result of the difference between W+ and W plays a significant role to determine if the specific class is favorable to the instability of the slope. In particular, the areas represented by negative Wf values can be considered stable and the areas represented by positive Wf values, unstable with respect to a specific predisposing factor. Therefore, the LSI resulting from the sum of the Wf may range between negative and positive values which allow for evaluating the stability or instability of the area.

To select the most susceptible area to analyze in detail in the Tier 2 assessment, a susceptibility class has been assigned to each administration unit of the study area. Different levels of administration units are available in GADM website.Footnote 1 Since different levels are available for different countries, a specific level has been assigned to each country to homogenize the dimension of the administration units all over the study area. Taking inspiration from Arup (2020) the relative landslide susceptibility of the single administrative unit has been evaluated as the 80th percentile of the LSI pixel-based map (Fig. 8) and then classified from very low to very high to optimize the ROC curve weighted over the extension area of the administrative units area. The result is shown in Fig. 10.

Fig. 10
figure 10

Landslide susceptibility map based on administrative units

The prediction performance and the success of the Tier 1 analysis have been evaluated by the ROC curves (Fawcett, 2006), which are reported in Fig. 9c. The curves have been plotted using the validation dataset and the training data set, respectively. The resulting AUC is equal to 0.91 for the prediction curve and equal to 0.90 for the success curve. Overall, the model applied to the selected study area has demonstrated good reliability to evaluate potential instability areas.

An additional way to evaluate the prediction capacity of the model is presented in Fig. 11. It compares the ROC curve of the single causative factor with the curve of the landslide susceptibility map. The significant differences among the ROC curve of the susceptibility map (AUC = 0.91) and the curve of the slope (AUC = 0.77), relief (AUC = 0.77), precipitation (AUC = 0.90) along with the lower values of the AUC of the other causative factors, confirm the goodness of the prediction performance based on the combination of multiple factors (slope, curvatures, relief, land cover, lithology, PGA and precipitation) in comparison with the use of single factor alone (Günther et al. 2013b; Remondo et al. 2003). The factors that appear to play a major role in predisposing slope instability phenomena with respect to the landslide inventory used are slope, relief and precipitation.

Fig. 11
figure 11

ROC curves and relative prediction performances (AUC) of the causative factors and of the susceptible map

Conclusions

The work is aimed to map the landslide susceptibility in the Belt and Road Countries. In this framework, the landslide susceptibility zoning through the multi-Tier approach has been carried out.

The landslide susceptibility map of south-Asia has been modeled using a quantitative, statistical method. Eight independent variables, i.e., Slope, Plan curvature, Profile curvature, Relative relief, Lithology, Land cover, PGA, Precipitation, have been classified and then weighted by the WoE. The analysis has been based on the NASA-COOLR landslides inventory. It is a global landslides catalog that collects data from online media reports, disaster databases, scientific reports, citizen reports, and others. All the data and software used in this work are open and open-source.

The result is a 5 classes landslide susceptibility map. The ability of the susceptibility model to predict the spatial distribution of the landslides and the goodness of the model fitting have been evaluated by the comparison of the ROC curves calculated from the validating and training datasets. The prediction and success performance are 0.91 and 0.90, respectively. Among the causal factors slope, relief and precipitation play a major role. The administrative units, displaying moderate to very high susceptibility class, have been selected for further analysis to be carried out at a national scale (Tier 2).