1 Introduction

Natural disasters have frequently occurred in recent decades because of extreme weather and global warming, i.e., floods, earthquakes, and landslides. Landslides are one example of a severe type of major geological disaster (Guzzetti et al. 1999; Huang et al. 2020; Wang et al. 2021). The effects of the landslide phenomenon include property loss, loss of life, and damage to infrastructure that can be severe enough to affect the stability of developed societies (Lee et al. 2012). The occurrence of landslides is governed by many factors, such as earthquakes, heavy rainfall, and volcanic activity (Guzzetti et al. 1999). Notably, the increasing economic development in urban areas, such as changes in vegetation activity and the construction of roads or buildings, can cause rapid changes at landslide-prone locations. In Indonesia, in the Kepahiang Regency, Bengkulu Province, i.e., the study area, landslides are frequently triggered by earthquakes and heavy rainfall (Hadi and Siswanto 2016; BMKG 2022). The Kepahiang Regency is one of the most earthquake-prone areas along the Sumatran fault, with frequent tectonic and seismic activities (Sieh and Natawidjaja 2000; BMKG 2022). Therefore, motivated by these conditions, it is necessary to assess the landslide susceptibility at certain sites in the Kepahiang Regency. Previous landslide investigations in this study area mainly focused on documenting simple geological hazards and particular events (Hadi and Siswanto 2016; Sukisno and Muin 2012). In contrast, in this study, we developed a multivariate method for determining the landslide susceptibility in a largely overlooked area.

The multivariate statistical method is a quantitative method and has been used as a landslide assessment tool or technique (Rotigliano et al. 2011). Quantitative models are commonly used to determine the landslide susceptibility. However, quantitative methods depend on simultaneous observations and statistical models that require a large number of variables to produce landslide susceptibility maps (Kavzoglu et al. 2014). The quantitative model is also an essential tool for identifying the landslide potential in areas with high rates of earthquake-induced landslides (Feizizadeh and Blaschke 2013). In particular, studies combining a geographic information system (GIS) and a multicriteria decision approach (MCDA) have been conducted by Goumrasa et al. (2021) and Feizizadeh et al. (2014). They applied the AHP and SAW methods, which are effective techniques for determining the landslide susceptibility. El Jazouli et al. (2019) established a landslide susceptibility map of Wenchuan, China, using the MCDA based on characterization data and rainfall information. Regarding the Kepahiang Regency area, Hadi and Siswanto (2016) established empirical relationships between catastrophe theoretical aspects and slope stability. They emphasized the high value of the critical importance (Cr) index and demonstrated the relationship between landslide occurrence and the critical threshold of the slope stability. Sukisno and Muin (2012) proposed a landslide susceptibility mapping method based on simple geological hazard research and field techniques to generate landslide maps of the Kepahiang Regency area. In this study area, despite the existing landslide susceptibility prediction studies, no prior research has employed the multi-variance statistical quantitative method for predicting potential landslides in the Kepahiang Regency, Indonesia.

Therefore, the main novelty of this study is the assembly of various multivariate statistical methods, i.e., the SAW, AHP, and FR methods, to investigate an earthquake-prone area, namely, the Kepahiang Regency, Bengkulu Province, Indonesia. The objectives of this study were as follows: (i) develop a landslide susceptibility mapping technique; (ii) apply the above three statistical algorithms/methods to compare the validation precision through the AUC and employ another evaluation approach for assessing error metrics, namely, the root mean square error (RMSE), mean square error (MSE), mean absolute error (MAE), mean error (Mean), standard deviation (StD) and correlation coefficient (R); and (iii) identify the most effective causative factor for the landslide susceptibility. Additionally, this work could help the government mitigate the destructive effects of landslides, save lives, and minimize the damage caused by landslides in the Kepahiang Regency, Indonesia. Moreover, a reliable multivariate statistical method would be valuable for providing insight into accurate landslide susceptibility mapping and improving the ability to map potential landslide hazards.

2 Overview of study area

The Kepahiang Regency is located in the northeastern part of Bengkulu Province, Indonesia. It lies between 3°33′0″S to 3°43′0″S and 102°29′0″E to 102°46′0″E (Fig. 1a) and covers an area of approximately 7046 km2. According to statistical data from 1941–2021 (BPS-Statistic of Kepahiang Regency 2021), the total population of this regency is 453,984 people, yielding a population density of 1843 persons/km2 (BPS-Statistic of Kepahiang Regency 2021). The area is mountainous with a complex geological structure (Hadi and Siswanto 2016; BMKG 2022). The study area is situated in a high mountainous region surrounded by deep valleys and Kaba volcanic rock and Quaternary volcanic formations (Fig. 2g). The elevation ranges from 500 to 2500 m above mean sea level (Fig. 1a). There are three major rivers in the area: Air Musi, Air Brimming, and Air Sengai (Fig. 2g). The slope ranges from 0.4° to 65°, and the average wind velocity is approximately 7.03 km/hour. During the monsoon season from October to May, the study area receives high precipitation, with an average value of 3080 mm/year (Fig. 2i) (BMKG 2022). The Kepahiang Regency occurs in an earthquake-prone area along the Sumatran fault segment with a high slip fault rate of approximately 6.1 mm/year (Sieh and Natawidjaja 2000; BMKG 2022). The three main groups of active fault zones are the Musi–Keruh fault, Despetah fault, and Babakan fault (Sukisno and Muin 2012; BMKG 2022). Figures 1b–e show the topography as documented in previous landslide studies based on field investigation data and official documentation records.

Fig. 1
figure 1

a Location of the study area; b, c previous landslides (source: DEM-NAS – Google Pro image); d, e previous landslides in the study area (source: field visits). Photographs were taken during the field visit

Fig. 2
figure 2figure 2

Landslide causative factors considered in the study area: a Altitude; b slope degree; c slope aspect; d plan curvature; e profile curvature; f distance from the road; g geology; h distance from the fault; i rainfall; j peak ground acceleration (PGA); k time-averaged shear-wave velocity to the 30-m depth (Vs30); l amplification factor; m predominant frequency; n ground shear strain

3 Data for landslide susceptibility prediction

3.1 Data construction

In this study, a landslide susceptibility assessment was performed by generating a landslide inventory based on available causative factor data. The inventory offers essential information to evaluate various aspects of landslides, including types, locations, frequencies, and triggering factors (Wieczorek 1984). Different methods, such as satellite imagery-based and field investigation techniques, have been utilized to identify and compile a comprehensive landslide inventory (Wieczorek 1984; Petley et al. 2005; Mind’je et al. 2020; Mersha and Meten 2020). A total of 63 landslide samples were identified using remote sensing techniques, including a digital elevation model of Indonesia, namely, DEM-NAS, with a resolution of 0.27 arc seconds, Google Earth Pro, and field observations (as shown in Fig. 1). Of these samples, 11 were obtained through field visits coordinated by the Department of Physics and Geophysics of Bengkulu University, Indonesia. An additional 52 landslide samples were collected based on DEM-NAS and Google Earth Pro data. To create training and validation datasets, these selected landslide samples were randomly split into fractions of 80% and 20% for training and testing purposes, respectively, following the approach of Mind’je et al. (2020) and Mersha and Meten (2020). On the basis of a similar approach suggested by Tang et al. (2020), 63 non-landslide samples were also compiled using Google Earth Pro and DEM-NAS. These non-landslide samples were divided into training (80%) and validation data (20%). All datasets were then merged for model validation. Both the training and validation datasets were represented in binary form, with a value of 0 assigned to non-landslide locations and a value of 1 assigned to landslide locations. Next, the dataset containing the landslide catalog and causative factors underwent transformation within ArcGIS 10.8.1 software, resulting in a dataset with dimensions of 2000 columns and 1600 rows. Within the study area, each pixel was covered by an 11 × 11-pixel window. Approximately \(\pm\) 20,000 pixels were analyzed in the study area. Additionally, 63 non-landslide points were extracted to construct a network model dataset, including image edge pixels to enhance the accuracy of the LSM. The entire dataset was divided into two segments, with 80% allocated for model training and the remaining 20% allocated for evaluating the LSM predictive capability, as outlined by Azarafza et al. (2021) and Tang et al. (2020).

3.2 Landslide causative factors

A landslide susceptibility map was constructed based on the relationships between the considered landslide causative factors and landslide inventory data (Tang et al. 2020; Dou et al. 2015; Das et al. 2013). In this study area, causative factors were identified based on the available field data and official data. All causative factors were acquired from DEM-NAS, transformed into raster format and harmonized to 30 × 30 m cell pixels using ArcGIS 10.8.1, as detailed in Additional file 1: Table S1. Then, we separated the causative factors into the following categories: topographical category (altitude, slope degree, slope aspect, plan curvature, and profile curvature), geological category (geology and distance from faults), hydrological category (annual rainfall), and seismic category (peak ground acceleration (PGA), time-averaged shear-wave velocity to the 30-m depth (Vs30), predominant frequency, amplification factor, and ground shear strain (GSS)). These factors were selected for analysis in this study, as shown in Fig. 2.

3.2.1 Topographic category

The altitude is a crucial factor in identifying changes in slope conditions and measuring the direction, steepness, decline, or growth over a specific distance (Chang et al. 2019; Mind’je et al. 2020). It can be classified into five categories (Fig. 2a). The slope degree is a crucial factor in areas prone to landslide failure, determining the stress distribution on hillslopes (Donnaruma et al. 2013; Chen et al. 2018) (Fig. 2b). The slope aspect indicates the slope face state, which is related to landslide events due to water control and vegetation moisturization. The plan curvature describes the geometry of the slope, while the profile curvature describes the maximum parallel slope divided into three classes (Fig. 2d). The distance from the road is another contributing factor to landslide occurrence, and some landslide events are related to construction conditions in mountain areas, such as slope cutoffs for highways or construction purposes (Fig. 2f).

3.2.2 Geological category

Geology imposes a considerable influence on the occurrence of landslides (Chigara 2002; Vasudevan and Ramanathan 2016). The geology factor is divided into six types (Fig. 2g). The distance from faults is correlated with geology, affecting the structural plane and rock strength (Has et al. 2010; Gemitzi et al. 2011). The distance from faults is classified into five classes (Fig. 2h).

3.2.3 Hydrological category

Rainfall is a crucial factor in triggering landslides, as it influences the development and occurrence of events (Lin 2015; Wu et al. 2016; Lee 2017; Jaelani et al. 2020). As a result, moisture in soil increases, reducing soil cohesion and increasing the likelihood of landslides (Vasudevan and Ramanathan 2016; Jaelani et al. 2020). Data on the average annual rainfall from 1914–2021 were collected from the Indonesia Ministry of Climatology, Meteorology, and Geophysics (BMKG 2022) and divided into eight classes, as shown in Fig. 2i.

3.2.4 Seismic control category

Seismic control categorical variables are crucial in determining landslide events, as earthquake-induced landslides are affected by the earthquake magnitude (Hsieh et al. 2014; Wu et al. 2019; Martino et al. 2022). The correlation between the landslide density and PGA has been reported in previous studies (Jibson and Keefer 1994; Keefer 1984; Meunier et al. 2008). The seismic control category in this study encompasses earthquake factors, which are divided into two classifications: non-field and field causative factors.

The non-field causative factors include the causative factors of the PGA and Vs30 (Fig. 2j, k). The PGA is equal to the maximum ground vibration or highest acceleration during earthquake shaking (Das et al. 2013; Yin 2014). To determine the PGA value in this study, earthquake data were retrieved from the United States Geological Survey (USGS), International Seismological Center (ISC), and BMKG. Earthquake events were selected for the period from 1914 to 2021, with average magnitudes of M > 5.0 and depths of < 70 km. We used the total probability seismic hazard approach (PSHA) for the calculation (Additional file 1: Note S1 for the detailed equation). Then, we used MATLAB software to acquire PGA data (Fig. 2j). The PGA value classification was derived from the Ministry of Disaster Management Bengkulu (BNPB). A high PGA value indicates an unsafe (high-risk) area where an earthquake is more likely to occur and a landslide is more likely to be triggered (Das et al. 2013; Yin 2014). Then, we also selected Vs30 as a causative factor. The Vs30 indicator is another useful geotechnical parameter for seismic analysis, indicating the magnitude of the shear modulus and elastic properties (Allen and Wald 2007) (Fig. 2k). To determine the causative factor of Vs30, data were collected from the USGS database, with an observational point grid of approximately 1 km. A low Vs30 value suggests a higher risk of an earthquake or landslide in the area, which is related to the stiffness of the rock as well as the slope conditions (Brain et al. 2015).

The field factors include the predominant frequency, amplification factor, and GSS (Fig. 2l and m). The predominant frequency is related to the depth and bedrock (Hata et al. 2015) (Fig. 2m). The amplification factor is determined by the topography and geological factors (Fig. 2l). Warnana et al. (2011) found that in seismic hazard analysis, a high amplification value indicates a large thickness of soft soil or sediment (Fitri et al. 2018; Oliveira et al. 2008; Ishihara et al. 1996). Then, we also selected the GSS as a causative factor (Fig. 2n). Moreover, the GSS is a causative factor reflecting the subsurface structure and soil characteristics (Damayanti and Sismanto 2021; Ishihara et al. 1996). Microtremor data were used to analyze GSS values, resulting in amplification factor and predominant frequency values. Field data were collected from the Department of Physics and Geophysics, Bengkulu University, Indonesia. The GSS was calculated as described below (Nakamura 2008) (Additional file 1: Note S2). Then, Geopsy software was used to analyze the amplification factor and predominant frequency for determining the GSS value.

4 Methodology

4.1 Overview

In this paper, we proposed a multivariate statistical method to improve the prediction accuracy of LSM in Kepahiang Regency, Indonesia. Figure 3 shows the overall workflow of this study, which involves five steps: (1) preparation of a landslide inventory and causative factors; (2) feature selection of the causative factors using multicollinearity and Pearson correlation analysis with rank-based regression; (3) applying a modeling process using the AHP, SAW, and FR methods; (4) generation of landslide susceptibility maps; (5) validation and comparison of each map using the AUC technique and another evaluation method, thereby assessing error metrics, i.e., RMSE, MSE, MAE, Mean, StD and R.

Fig. 3
figure 3

Study flowchart for landslide susceptibility mapping

4.2 Multicollinearity and Pearson analysis

In this study, we utilized multicollinearity and Pearson correlation analysis with regression rank to select the best causative factors for obtaining better-quality landslide susceptibility maps. Multicollinearity among the selected factors was assessed based on the variance inflation factor (VIF) and tolerance (TOL) using training data (Roy et al. 2019; Mallick et al. 2021). A TOL value of < 0.1 and a VIF value of > 5 indicate the presence of multicollinearity problems, which should be excluded in the process of landslide susceptibility mapping (Mallick et al. 2021) (Fig. 4). Additionally, we employed Pearson correlation analysis to identify appropriate independent causative factors, excluding factors with a correlation value > 0.7, indicating high collinearity (Roy et al. 2019; Kalantar et al. 2020) (Fig. 5). Finally, to rank the importance of the selected factors in the study area, we used regression analysis (Soma et al. 2019). Detailed equations are provided in Additional file 1: Note S3.

Fig. 4
figure 4

Multicollinearity test results for the 15 landslide causative factors. Disr (distance from the road), Pro (profile curvature), Pln (plan curvature), DisF (distance from the fault), A0 (amplification factor), f0 (frequency domain), Vs30 (shear velocity to the 30-m depth), Geo (geology), GSS (ground shear strain), Elv (elevation), Altd (altitude), Slpa (slope aspect), PGA (peak ground acceleration), Rain (annual rainfall), and Slpd (slope degree) are causative factors in this study

Fig. 5
figure 5

Pearson correlation analysis between the landslide causative factors. The blue circle with a value less than 0.7 indicates a major causative factor considering the threshold for appropriate features

4.3 AHP method

The AHP method is a component of the overall MCDA, enabling decision-makers to create a scale of criteria for alternative solutions. The basic concept of the AHP method is based on pairwise matrix comparison of index criteria, with which decision-makers can assess the relative importance or preference between criteria or alternatives. In the AHP method, numerical values are assigned to comparisons, determining the priority of individual factors and subfactors causing landslides (Fig. 6) (Additional file 1: Table S2). The Saaty scale is commonly used, in which numerical values ranging from 1 to 9 are assigned, with the detailed equation provided in Additional file 1: Note S4 (Saaty and Vargas 2001; Saaty 1994).

Fig. 6
figure 6

Parameter weights of the AHP method

4.4 SAW method

The goal of the SAW method is to find the weighted sum of performance ratings using specific criteria and a decision matrix to evaluate each causative factor (Podvezko 2011). The SAW method was employed for assessing the susceptibility or potential risk of landslides by considering the various influencing factors or criteria and ranking the alternatives based on the multiple criteria or factors (Kusumadewi and Guswaludin 2005) (Fig. 7) (Additional file 1: Table S3). The SAW equation is detailed in Additional file 1: Note S5.

Fig. 7
figure 7

Parameter weights of the SAW method

4.5 FR method

The basic concept of the FR method is the computation of the ratio and indication of the possibility of landslide occurrence or nonoccurrence for each causative factor class or interval (Li et al. 2017). We used this method to evaluate the causative factors and assess their relationship with the landslide distribution. Then, we divided the landslide frequency based on the non-landslide frequency for each factor class or interval. Each class frequency ratio was calculated using the relationships for landslide events (Oh et al. 2017). An FR value of 1 indicates an average value of the landslide occurrence in the total area. An FR value less than 1 indicates a low correlation between landslide occurrence and a given causative factor in the total area; the detailed equation is provided in Additional file 1: Note S6. Conversely, an FR value greater than 1 indicates a high correlation between landslide occurrence and a given causative factor in the total area (Kayastha et al. 2013) (Additional file 1: Table S4).

4.6 Validation Model

Validation models are an important and necessary part of the process of creating landslide susceptibility maps. The validation method was used to assess the model quality and confirm the accuracy of landslide predictions (Chen et al. 2018). In this paper, we calculated the AUC value (Fig. 8) as well as RMSE, MSE, MAE and StD values to evaluate the LSM performance (Figs. 9, 10, and 11). The AUC curve in this study was plotted for both the training and testing datasets, with AUC values ranging from 0 to 1. Values closer to 1 indicate a highly accurate or superior model (Bui et al. 2016). Further details and specific equations are explained in Additional file 1: Note S7.

Fig. 8
figure 8

AUC curve validation for the AHP, SAW and FR models. a Training sample set; b testing sample set

Fig. 9
figure 9

Loss value and MAE (mean absolute error) of the training and testing datasets: AHP (A, B), SAW (C, D), and FR (E, F)

Fig. 10
figure 10

Evaluation error metrics, namely, MSE (mean square error) and RMSE (root mean square error), Mean, and StD (standard deviation), of the AHP, SAW, and FR methods

Fig. 11
figure 11

Correlation between the true and predicted values for the three algorithms

5 Results

In this section, we aim to explore the feasibility of applying the AHP, SAW, and FR methods for creating LSMs. In addition, the selection of causative factors is a crucial aspect in ensuring the quality of susceptibility assessment in this study. We also evaluated and compared the performance of each model using various evaluation metrics.

5.1 Evaluation of the landslide causative factors

In this paper, SPSS and Software-R were used to evaluate the collinearity of the landslide causative factors. The choice of causative factor is a key aspect that influences the quality of the resultant landslide susceptibility maps (Roy et al. 2019). Although various methodologies can be used for selecting causative factors, there are still no specific criteria for ranking these causative factors (Costanzo et al. 2014). Therefore, we chose multicollinearity and Pearson correlation analysis followed by regression-based ranking (Irigaray et al. 2007). The multicollinearity results of the VIF test showed that all causative factors had values less than 5, and the TOL test results showed that all causative factor values were higher than 0.1. The maximum TOL and VIF values were 0.891 and 4.021, respectively (Fig. 4). As recommended by Roy et al. (2019) and Mallick et al. (2021), none of the values for the causative factors obtained in this study indicated multicollinearity problems. These causative factors could thus improve the accuracy of the generated landslide susceptibility maps. The TOL and VIF test results for all causative factors are shown in Fig. 4. In our results, there was no significant reduction in the entropy of a given category or no collinearity with another category. Then, the Pearson correlation analysis results showed that 15 causative factors had values mostly less than 0.7. The causative factor of the distance from the road exhibited a value of 0.68, close to the maximum value for the appropriate attribute feature but still acceptable for consideration in landslide susceptibility mapping (Fig. 5). Detailed Pearson correlation assessment results for all causative factors are shown in Fig. 5. Furthermore, regression was used to rank the best causative factors for assessing LSMs in this study. Through calculation, the top three factors in terms of importance are the slope degree, rainfall, and PGA, with details shown in Fig. 12 and Additional file 1: Table S5. Therefore, based on the analysis results, in our research, all 15 causative factors were considered for creating LSMs.

Fig. 12
figure 12

Ranking of the causative factors

5.2 Evaluation of the AHP, SAW, and FR methods

In this research, we evaluated and compared three statistical methods, and we then used the AHP, SAW and FR methods to create LSMs of the study area. First, with the use of the AHP method, weights were assigned to each of the landslide-controlling factors using pairwise comparisons (Saaty 1994). Accordingly, each causative factor was assigned a score from 1 to 9 based on its relative importance (Saaty and Vargas 2001). If the parameter on the x-axis is more important than that on the y-axis, the value varies between 1 and 9. However, when the factor on the y-axis is more important, the reciprocal values range from 1/2 to 1/9. AHP calculation results were obtained using ArcGIS 10.8.1 software with the integrated raster calculator spatial analysis tool, as shown in Fig. 6 and Additional file 1: Table S2. Regarding SAW evaluation, with the use of the multicriteria decision approach, weights were determined for the different factors employed in landslide susceptibility mapping (Kusumadewi and Guswaludin 2005). Figure 7 and Additional file 1: Table S3 provide the calculated SAW weight scores for all landslide causative factors. The highest weight score of 0.182 was assigned to the slope degree (Fig. 7). Additionally, we obtained FR calculation results to evaluate the causative factors, assess their relationship with the landslide distribution, and create landslide susceptibility maps (Li et al. 2017). Additional file 1: Table S4 shows the calculated FR values for all causative factors. For example, in areas with elevations of > 1000 m, the FR value was > 1, indicating a higher possibility of landslide occurrence. Finally, the AHP, SAW and FR values were examined to validate the model accuracy, which is explained in Sect. 5.4.

5.3 LSM

The LSM is a visualization of the results for predicting the landslide susceptibility. In this study, we created LSMs based on the AHP, SAW, and FR methods using ArcGIS software. Subsequently, following the Jenks natural breakpoint classification algorithm, the LSMs in this study were divided into four susceptibility groups in ArcGIS: very high, high, moderate, and low.

The very high- and high-susceptibility areas generated by the AHP method accounted for 21.5% and 40.4%, respectively, of the entire study area. The moderate- and low-susceptibility areas constituted 27.0% and 11.1%, respectively, of the whole area (Figs. 13a and 14a, respectively). Similarly, the very high- and high-susceptibility areas generated by the SAW method accounted for 12.3% and 37.1%, respectively, of the whole study area. The moderate- and low-susceptibility areas accounted for 40.3% and 10.3%, respectively, of the entire area (Figs. 13b and 14b, respectively). Finally, the FR method yielded very high- and high-susceptibility areas that accounted for 28.4% and 32.4%, respectively, of the entire study area. The moderate- and low-susceptibility areas accounted for 23.5% and 15.7%, respectively, of the whole area (Figs. 13c and 14c, respectively). The analysis of the LSMs of this study area showed that the proportion of moderate-susceptibility areas is the highest among the four methods. Very high-vulnerability zones were found in the Kabawetan, Kepahiang, and Tebat Karai districts (Fig. 14). These areas have a high possibility of sliding due to their high average precipitation, which also triggers earthquakes. Moreover, most of the study area is mountainous with high hills, rainfall, and seismic activity, which significantly influences landslide occurrence. Given that Kabawetan and Kepahiang districts are located in high-susceptibility zones and are densely populated areas, it is crucial to raise awareness regarding the potential or dangerous effects of landslides.

Fig. 13
figure 13

Simulated landslide percentages with the three algorithms

Fig. 14
figure 14

Landslide susceptibility map of the study area obtained using the three methods: a AHP; b SAW; c FR

5.4 Validation

Model validation is an essential component of the development of landslide susceptibility mapping. First, we divided the dataset into training and testing datasets following an 80:20 ratio (Azarafza et al. 2021). Next, we imported the training dataset into the AHP, SAW, and FR methods to determine the hidden relationships between the landslide causative factors and landslide occurrence. During model evaluation, we validated the performance of LSMs using the AUC technique and assessed the accuracy and error metrics using both the training and testing datasets (Fig. 8). The AUC curve was created using the R programming language. The x-axis of the AUC curve indicates specificity, representing the probability of a non-landslide point, while the y-axis indicates sensitivity, representing the probability of correct prediction. An AUC value close to 1 indicates a more accurate prediction result (Bui et al. 2016). The success rates on the training dataset for the AHP, SAW, and FR methods were 0.866, 0.838, and 0.812, respectively (Fig. 8a). Similarly, employing the testing dataset to obtain prediction rates for the AHP, SAW, and FR methods, we found values of 0.863, 0.833, and 0.807, respectively (Fig. 8b). Based on the results, it was evident that the AHP algorithm exhibits the highest accuracy, sensitivity, and reliability among the landslide predictive models. Additionally, we adopted another evaluation technique, i.e., assessing the error metrics for each algorithm method (Figs. 9, 10, and 11). Figure 9 shows the variation curves of the model accuracy loss for the four methods using the training and testing datasets. In addition, Fig. 10 shows the analysis of the error metrics of the RMSE, MSE, Mean and StD for each algorithm. Further discussions and comparisons of the model performance are presented in Sect. 5.5.

5.5 Performance of the models

In our research study, we established four methods to create LSMs. The accuracy of the LSM models was assessed using AUC validation. Figure 8 shows the AUC curves for both the training and testing datasets. Among the four models, the AHP method achieved the highest AUC value (0.866 and 0.863) for both the training and test datasets, respectively, followed by the SAW and FR methods (Fig. 8b, c). However, relying solely on the AUC as a metric might not be the optimal approach for determining the accuracy of each algorithm or guaranteeing higher spatial accuracy (Parra et al. 2023). Then, by incorporating multiple evaluation metrics, we aimed to gain a deeper understanding of the model performance and accuracy of each algorithm method. Therefore, to provide a comprehensive evaluation, we utilized other metrics, such as loss accuracy, MSE, RMSE, Mean, StD and R (Figs. 9, 10, and 11), to analyze the accuracy of the models based on both the training and testing datasets.

The loss curves for the training and testing datasets are shown Fig. 9, comparing the loss and epoch. In this study, we set the initial increment to 0.1 to dynamically decrease the number of the model operations (Fig. 9), as recommended by Azarafza et al. (2021) to achieve a better-quality LSM. Then, in the testing process and loss vs. epoch comparison, the loss value of the FR method decreased in 5 iterations and then stabilized at approximately 0.005 (Fig. 9e). This condition may affect the consideration of spatial features in the acquisition of pixel sequences, but it still follows the normal trend for creating LSMs (Azarafza et al. 2021). However, in the training process loss vs. epoch comparison of the SAW method, the loss value decreased in the first five iterations and then stabilized closer to zero (Fig. 9c). Similarly, the AHP method also showed a loss with epoch value that decreased in the first five iterations and stabilized close to zero, with a trend change that remained more stable than that of the other methods (Fig. 9a). The fixed epoch number in this study ranged from 1 to 30, with a step size of 5. In Fig. 9e, we found that the loss function curve of the FR model was unstable but still approached normal limits when the epoch was 10. Finally, we could conclude that in terms of the loss accuracy (loss–epoch curve), the loss function curve of the AHP model remained stable when the epoch was 5 (Fig. 9a).

Moreover, Fig. 9 shows the accuracy curves for the training and testing datasets in terms of the MAE and epoch comparison. The analyzed results of the MAE indicated that the AHP model exhibits a slight change in the trend (Fig. 9b). It remained consistent with a stable decrease during the five iterations, with a MAE value closer to 0.200. Similarly, the change trends of the MAE and loss were consistent for the AHP method in regard to the prediction accuracy for both the training and testing datasets. The unstable conditions of the SAW and FR methods (Fig. 9d, f) may be due to the acquisition of pixel sequences and the absence of overfitting problems. Therefore, based on the loss with epoch and MAE vs. epoch comparison, the parameter settings for the AHP algorithm are highly reasonable, followed by the SAW and FR methods, for creating LSMs.

In this study, we also adopted another evaluation approach by referencing Nguyen et al. 2019, to assess the error in every algorithm (Fig. 10). The error evaluation entailed the comparison of the MSE and RMSE values. The prediction values varied between 0 and 1 (Nguyen et al. 2019). The results indicated that the maximum values of the MSE and RMSE are obtained by the AHP method (0.002, 0.038) (Fig. 10a), followed by the SAW and FR methods (Fig. 10c, e). Furthermore, we evaluated the frequency versus error, visualizing the values of the Mean and StD to assess the LSM quality (Fig. 10b, d, f). The results showed that using the AHP method also resulted in the minimum values (Fig. 10b). The minimum StD value indicates that the developed model achieves favorable predictive ability in creating LSMs, signifying the best prediction performance and high precision (Nguyen et al. 2019). The mean rates for the AHP, SAW, and FR methods were 0.000, 0.002, and 0.021, respectively (Fig. 11b, d, f, h). Similarly, employing the StD rates for the AHP, SAW, and FR methods, we obtained values of 0.000, 0.075 and 0.276, respectively (Fig. 11b, d, f). Finally, based on frequency vs. error analysis, the AHP method demonstrated a better performance than the other algorithm methods in this study. To further verify the prediction reliability of LSMs, we conducted a correlation analysis between the predicted and true values, as shown in Fig. 11. The correlation coefficient based on the FR model was the lowest, with a value of 0.986, indicating that the accuracy of the FR model was lower than that of the other methods. In contrast, the correlation coefficient of the AHP model was 0.998, which is significant and reveals a high prediction accuracy. The correlation coefficient showed that the AHP method is better than the other methods. Then, the correlation coefficients of the methods could be ranked as AHP > SAW > FR overall. Finally, the accuracy analysis with error metrics indicated that most of the evaluation results show that the AHP method ranks first in terms of accuracy for creating LSMs of the study area.

6 Discussion

Spatial prediction of landslides has been considered for assessing and mitigating risks (Lee et al. 2012). Although other methodologies involving geological techniques have been proposed for the study area, the accuracy of the predictions remains a controversial issue. In this research, we addressed this concern by evaluating and comparing several multivariate statistical methods. In general, FR method outperforms the other methods in terms of effectiveness. However, this might lead to computational challenges in data processing. Nevertheless, based on the AUC results (Fig. 8), the FR method could still be deemed acceptable for creating LSMs, with an AUC value exceeding 0.8, representing an accuracy of over 80% (Bui et al. 2016). Regarding the AUC results shown in Fig. 8, all algorithms in this study demonstrated high values. Nonetheless, relying solely on the evaluation of this metric may not be the optimal strategy to guarantee a higher level of spatial accuracy during model evaluation (Nguyen et al. 2019). Therefore, an additional evaluation technique is needed. In this study, RMSE, MSE, MAE, Mean, StD, and R provided more detailed insights into the differences in accuracy among the utilized methods. The FR method yielded AUC values ranging from 0.807 to 0.812, with a mean value of 0.021 and an StD value of 0.276. The SAW method yielded AUC values ranging from 0.833 to 0.838, with RMSE and Mean values of 0.077 and 0.002, respectively. Finally, the AHP method exhibited AUC values ranging from 0.863 to 0.866, with an MSE value of 0.002, Mean value of 0.000 and R value of 0.998. Ultimately, based on the AUC and other evaluation results (RMSE, MSE, MAE, Mean, StD, and R), the results indicated that the AHP method outperforms the other algorithms in terms of prediction accuracy. The AHP method demonstrated the highest prediction accuracy, followed by the FR and SAW methods. Despite being one of the traditional methods for creating LSMs of this study area, the AHP method provided a superior performance relative to the other algorithms. Its higher prediction accuracy and precision establish it as the most reliable method for creating LSMs in the Kepahiang Regency, Indonesia. These findings demonstrate the effectiveness of the AHP method in predicting the landslide susceptibility and provide valuable insights for landslide risk assessment and mitigation in the study region.

Then, based on the assessment of the analysis results in ArcGIS software, we calculated the percentage of historical landslides and created LSMs by four statistical methods in four categories (Figs. 13 and 14, respectively). The results showed that the majority of landslides could be classified in the very high and moderate susceptibility categories, with most occurrences observed in the Kabawetan, Kepahiang, and Tebat Karai districts (Fig. 14). Then, to identify the most critical causative factors and better understand the mechanism of triggering landslides in this study, we employed regression ranking techniques (Soma et al. 2019). The regression results indicated that slope, rainfall, and peak ground acceleration (PGA) are the most important causative factors and key contributors to landslides in the study area (Fig. 12 and Additional file 1: Table S5). Furthermore, based on information gleaned from historical meteorological data, the landslides in this study area are primarily caused by slope and rainfall due to the effect of mountainous areas. During the rainy season (October until May), landslides are the most common occurrence, with an annual precipitation ranging from 2000 to 4000 mm/year (BMKG 2022). The high rainfall softens the soil under increasing water saturation, leading to cracks and water sensitivity in the ground, resulting in static liquefaction and collapsibility, making the area vulnerable to landslides. Additionally, the increase in the pore water pressure further increases the weight of the rock mass, creating a slide zone on the slope and causing slope failure or landslides (Hsieh et al. 2014). During our field visit, we observed open cracks on some slopes, indicating the potential for future slope failure or landslides. Moreover, the PGA is one of the main causative factors contributing to the occurrence of landslides in this area. PGA values can represent part of the soil or rock conditions, and landslides are triggered when the soil strength is diminished during earthquake events (Yin 2014). The Kepahiang Regency area is part of the Sumatran fault belt system, which is tectonically and seismically active. For instance, on 15 December 1979, an earthquake with a magnitude of 6.6 and a depth of 33 km caused major landslides and the collapse of 500 houses in this area (Hadi and Siswanto 2016; BMKG 2022). The frequent tectonic and seismic activities in the region add to the susceptibility of landslides in the study area. Additionally, the cutting of roads and farming activities contribute to slope failure or landslides in this study area.

7 Conclusions

Landslides rank among the most severe natural disasters in the world, causing significant devastation to human lives and critical infrastructure. Therefore, to prevent casualties resulting from potential landslides, we assessed the use of multivariate statistical methods for landslide susceptibility mapping in the Kepahiang Regency, Bengkulu Province, Indonesia. In this study, First, we created a landslide catalog dataset by using satellite imagery and field inquiry data, thereby combining non-landslide data to enhance the accuracy of LSMs. We utilized ArcMap, SPSS, and Software-R for data processing. Additionally, we considered 15 causative factors for constructing LSMs. We employed multicollinearity analysis, Pearson correlation analysis, and regression-based ranking to identify the most influential causative factors. Our results indicated that the top three causative factors are the slope degree, rainfall, and PGA, which play significant roles in susceptibility assessment in the study area. Moreover, this condition matches the field situation of the study area (with high annual precipitation), and combined with frequent tectonic and seismic activities, it is particularly vulnerable to landslides. Then, we validated these methods using multiple metrics, including the AUC, MAE, MSE, RMSE, Mean, StD, and R. The AUC validation results showed that the AHP method achieved the highest AUC score of 0.863, followed by the SAW and FR methods, at 0.833 and 0.807, respectively. We also assessed the accuracy and error using MSE and RMSE values, with the AHP method providing the highest values. The mean rates for the AHP, SAW, and FR methods were 0.000, 0.002, and 0.021, respectively. Regarding the methods, overall, the correlation coefficient rankings were as follows: AHP (0.998), SAW (0.992), and FR (0.986) methods. Considering comprehensive accuracy evaluation, it could be concluded that all three models (AHP, SAW, and FR) demonstrate satisfactory performance. However, the AHP method stood out as the best method for creating LSMs based on all evaluation metrics. In addition to advancing LSM development, the evaluation using multiple metrics could facilitate comprehensive model performance assessment. We categorized LSMs into four zones: very high, high, moderate, and low susceptibility. Our results revealed that zones with increased susceptibility are more common in the Ujan Mas Seberang Musi, Kepahiang, Kaba Wetan, and Tebat Karai districts, accounting for over 50% of the study area. Therefore, the significance of early warning of landslides must be strengthened to mitigate landslide risks and prevent property damage. In addition, the distribution of information on the landslide susceptibility among the public could help save lives and support the government in managing high-risk areas, development of infrastructure, and planning of sustainable land uses.