Landslide risk assessment integrating susceptibility, hazard, and vulnerability analysis in Northern Pakistan

The purpose of this study is to assess the landslide risk for the Hunza–Nagar Valley (Northern Pakistan). Different conditioning factors, i.e., topographical, geomorphological, climatic, and geological factors, were considered. Two machine learning approaches, logistic regression (LR) and artificial neural network (ANN), were used to develop landslide susceptibility maps. Model accuracy was tested using the receiver operating characteristic (ROC) curve, which showed that the success and prediction rates of the LR model are 82.60% and 81.60%, compared with 77.90% and 75.40% for the ANN model. Due to the physiographic condition of the area, rainfall density was considered the primary triggering factor, and a landslide index map was generated. Moreover, a land cover (LC) map was developed using the Aster data. The settlements were extracted from the LC map and used as the elements at risk, from which the vulnerability index was developed. Finally, the landslide risk map (LRM) for the Hunza–Nagar valley was developed. The LRM indicated that 37.25% (20.21 km²) and 47.64% (25.84 km²) of the total settlements lie in low and very high-risk zones, respectively. This landslide risk map can help decision-makers plan land development and landslide countermeasures.


Introduction
Geological hazards include landslides, debris flows, rockfalls, rockslides, mudslides, and rock avalanches, which are among the most catastrophic natural hazards worldwide. Landslides are the most commonly occurring geological hazard; they cause serious damage to infrastructure and threaten human lives and the economy [1,2]. Worldwide, landslides are reported to cause hundreds of thousands of deaths and losses of hundreds of billions of US dollars each year [3]. Another estimate puts the worldwide toll at approximately 1016 deaths and economic losses of about 4 billion US dollars every year [4]. The escalation in landslide-induced damage is attributed to increasing urbanization, unplanned development, deforestation, and the effects of climate change [5,6]. Therefore, landslide susceptibility and risk assessment are important for developing and implementing effective solutions that minimize these catastrophic consequences [7]. The evaluation of risk for individual landslides has reached a relatively advanced stage, whereas the assessment of landslide risk on a regional scale has not been extensively explored in the existing literature [8]. The landslide susceptibility map (LSM) indicates the probability of a hazard occurring under the influence of different causative variables. The landslide risk assessment (LRA) identifies the elements (i.e., infrastructure, population, buildings, hospitals, etc.) that are at risk due to the occurrence of a hazard in an area [9,10].
The fundamental methodologies employed in landslide risk analysis and assessment can be categorized into quantitative and qualitative approaches [11,12]. Quantitative methods require a minimum of three types of information: the probability of landslide occurrence, the count of exposed elements at risk, and the anticipated level of loss associated with these elements (i.e., vulnerability) [7,9,13]. As a result, risk is generally expressed as a function of hazard susceptibility and vulnerability parameters, with the latter derived from the elements at risk. Similarly, landslide susceptibility methods are categorized into qualitative and quantitative methods [3,14,15]. The analytical hierarchy process (AHP) and expert knowledge-based approaches are the most commonly used qualitative approaches [16,17]. However, these qualitative methods rely on grading by experts, which makes high accuracy difficult to achieve [18]. Quantitative techniques, in contrast, were found to be more objective. Quantitative techniques, including weights-of-evidence (WoE), Shannon entropy (SE), statistical index (SI), and frequency ratio (FR), have been commonly used for landslide susceptibility modeling (LSM) [19-22]. However, these statistical approaches were found to be relatively weak and difficult to interpret [23]. With the modernization and wide use of remote sensing technologies, it has become much easier to obtain landslide hazard-related data [24]. Many researchers have proposed and adopted machine learning techniques to capture the complex relationship between landslides and their conditioning factors and to obtain highly accurate results [25]. The most commonly used machine learning techniques for landslide susceptibility are logistic regression (LR), artificial neural networks (ANN), support vector machines (SVM), decision trees (DT), convolutional neural networks (CNN), and random forests (RF) [24, 26-33].
The Hunza-Nagar valley is exposed to landslides because of its physiographic and climatic conditions. The valley is located in the Karakoram mountain region, which has been frequently affected by mass movements [34]. More than 70% of the Hunza-Nagar valley lies on the uplifted portion of the main Karakorum thrust (MKT) fault, making it more vulnerable to landslide occurrence. On January 4, 2010, a catastrophic landslide occurred in the Attabad valley. As a result of this event, 20 people died and more than three hundred houses were destroyed. The Attabad landslide blocked the Hunza river and submerged 19 km of the Karakorum highway [35]. Events of this kind threaten not only human lives but also the economy. In light of these challenges, the urgency of a comprehensive landslide risk assessment becomes evident. Despite the need for such assessments, existing research by Ahmed, et al. [36], Ali, et al. [37], Khan, et al. [38] and Ahmad, et al. [39] has predominantly focused on regional-level landslide susceptibility using different qualitative and quantitative approaches. While these studies have provided valuable insights, they are limited to landslide susceptibility alone and lack a complete evaluation of hazard, vulnerability, and risk. Moreover, Bacha, et al. [40] and Baig, et al. [41] have examined landslide susceptibility and landscape-level hypsometry in the Hunza-Nagar region. However, there remains a significant research gap in the evaluation of hazard, the estimation of vulnerability, and the assessment of risk for this region. Our study aims to fill this gap by employing an integrated approach that encompasses landslide susceptibility, vulnerability estimation, and risk assessment.
This paper aims to develop the landslide risk map for the Hunza-Nagar valley. For this purpose, the landslide susceptibility maps were developed by using two machine-learning techniques (LR and ANN). The annual mean [...] dominant lithology of PML. The Rakaposhi Volcanic formation (RVF) comprises andesites, slate, loess, and phyllites. Tectonically, the area is located on the uplifted portion of the main Karakorum thrust fault, a result of the collision of the Indian and Eurasian plates (50 mya) [42].
Topographically, the Hunza-Nagar valley is located in the Karakoram mountain ranges. The elevation ranges from 1763 to 7697 m a.s.l., with an average of 3500 m a.s.l. The slope angles range from 0 to 89°, with an average slope of 30°. Alluvial fans, flood plains, and old glacial moraines are the common geomorphological features of this area. Moreover, this area is highly susceptible to landslides because of its environmental and climatic conditions (Table 2).

Mapping units
The mapping unit, crucial for assessing landslide susceptibility, serves as the smallest spatial primitive and can take the form of regular or irregular units. Common types of units in landslide susceptibility mapping include watershed units, slope units, and grid units [43]. Watershed units excel in evaluating floods and debris flow disasters, while slope units, which capture topographical, geological, and environmental conditions, offer theoretical advantages but are difficult to delineate manually, especially for large areas [44,45]. Although the grid unit may not comprehensively represent the terrain environment, its calculation is straightforward and convenient and it can divide an area into a large number of units, which has made it the most widely employed unit in landslide susceptibility assessments [46,47]. In this study, the minimum grid size was chosen under the criterion that it should not exceed 30 m, and was therefore set at 30 m. The guideline for the maximum size is to retain a sufficient number of cells with landslide cases as data samples, thereby mitigating potential model errors due to inadequate samples [48]. The dataset comprises 276 samples in total, with positive (landslide) and negative (non-landslide) locations at a 1:1 ratio.

Multi-collinearity
The tolerance (TOL) and variance inflation factor (VIF) indicate the effect of correlation among the conditioning factors in a regression. Multicollinearity is a problem that exists when the conditioning factors are highly correlated with one another. TOL and VIF are the two most important indices for multicollinearity assessment, and both were used to check multicollinearity among all the variables considered in this study. Normally, a TOL < 0.10 or a VIF > 5.0 indicates multicollinearity between the conditioning factors.
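As an illustration of how the two indices relate, the following minimal sketch computes VIF as the diagonal of the inverse correlation matrix and TOL as its reciprocal. This is not the SPSS procedure used in the study; the function names and the six sample values per factor are hypothetical and serve only to show the 0.10 and 5.0 thresholds in use.

```python
import math

def correlation_matrix(columns):
    """Pearson correlation matrix for equally long, non-constant numeric columns."""
    n = len(columns[0])
    means = [sum(c) / n for c in columns]
    centered = [[v - m for v in c] for c, m in zip(columns, means)]
    norms = [math.sqrt(sum(v * v for v in c)) for c in centered]
    k = len(columns)
    return [[sum(a * b for a, b in zip(centered[i], centered[j])) / (norms[i] * norms[j])
             for j in range(k)] for i in range(k)]

def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    k = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(k)]
           for i, row in enumerate(matrix)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular correlation matrix: perfect collinearity")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(k):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]

def vif_and_tol(columns):
    """VIF_j is the j-th diagonal entry of the inverse correlation matrix; TOL_j = 1 / VIF_j."""
    inv = invert(correlation_matrix(columns))
    vif = [inv[j][j] for j in range(len(columns))]
    return vif, [1.0 / v for v in vif]

# Hypothetical factor values at six sample points (not data from the study).
slope = [10.0, 25.0, 32.0, 41.0, 55.0, 60.0]
elevation = [1800.0, 2500.0, 2300.0, 3900.0, 3100.0, 4500.0]
twi = [9.1, 6.0, 7.5, 7.2, 4.4, 5.9]

vif, tol = vif_and_tol([slope, elevation, twi])
for name, v, t in zip(["slope", "elevation", "TWI"], vif, tol):
    flag = "collinear" if (v > 5.0 or t < 0.10) else "ok"
    print(f"{name}: VIF={v:.2f} TOL={t:.2f} {flag}")
```

Because TOL is the reciprocal of VIF, the two thresholds are not equivalent: TOL < 0.10 corresponds to VIF > 10, so the VIF > 5.0 criterion is the stricter of the two.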

Logistic regression model
Logistic regression (LR) is a machine learning technique based on a multivariate statistical algorithm that is most commonly applied to binary, multinomial, or ordinal data [58]. The LR model is based on the relation between the dependent variable (hazard and non-hazard) and the independent variables (conditioning factors) [59]. The dependent variable is the presence or absence of a hazard at a given point (binary values of 1 and 0), whereas the independent variables normally consist of categorical and continuous datasets. The LR model can be expressed as

$$P = \frac{1}{1 + e^{-z}} \qquad (1)$$

where $P$ is the occurrence probability of the dependent variable (0 and 1) and $z$ is the linear combination

$$z = l_0 + l_1 x_1 + l_2 x_2 + \dots + l_n x_n \qquad (2)$$

where $l_0$ is the intercept, $n$ is the number of independent variables (i.e., landslide conditioning factors), and $l_1, \dots, l_n$ are the coefficients that express the contribution of the independent variables $x_1, \dots, x_n$ to the occurrence of landslides [59].
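The logistic model described above can be sketched in a few lines. The intercept, coefficients, and factor values below are hypothetical placeholders, not the fitted values of this study; the function simply evaluates the linear combination z and passes it through the logistic function.

```python
import math

def landslide_probability(intercept, coefficients, factors):
    """Return P = 1 / (1 + exp(-z)) with z = l0 + sum(l_i * x_i)."""
    z = intercept + sum(l * x for l, x in zip(coefficients, factors))
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical intercept, coefficients, and classified factor values for one grid cell.
l0 = -2.0
coeffs = [0.8, 0.5, -0.3]
cell = [2.0, 1.0, 3.0]

p = landslide_probability(l0, coeffs, cell)
print(f"probability of landslide occurrence: {p:.3f}")
```

Whatever the coefficients, the output stays strictly between 0 and 1, which is what allows the result to be read as a probability and later classified into susceptibility classes.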

Artificial neural network
In this study, ANN is used as a second machine learning approach. The ANN is a powerful computational method modeled on the structure of the human brain. Its strength derives from simulating nonlinear relationships through training and learning. It also has a remarkable capacity to handle incomplete or inaccurate information and nonlinear, dynamic behavior [60,61]. The ANN method is mainly used in pattern recognition, identification, classification, autocorrelation, estimation, and related areas. Among ANN methods, the multi-layer perceptron (MLP) is the most commonly used technique for prediction and classification problems in various fields of research. The MLP consists of three main layers: (a) the input layer, (b) the hidden layers, and (c) the output layer. The input layer is the first layer and supplies the input parameters to the network; the last layer is the output layer, which delivers the results; and the hidden layer lies between the input and output layers [62].
The sizes of the input and output layers are determined by the considered conditioning factors and the output classes. The number of hidden layers is generally determined by trial and error. The MLP was trained with the back-propagation (BP) algorithm, using sigmoid transfer functions in the hidden and output layers. The BP algorithm is a widely used approach for training ANN models (i.e., for evaluating their weights). All input data were then fed to the network, and the model weights were calculated using eleven input nodes, seven hidden nodes, and one output node to generate a landslide susceptibility map. The probability of hazard occurrence is normalized to lie between 0 and 1. The mathematical expression for calculating landslide susceptibility is given as

$$GS = f\left(\sum_{i=1}^{n} w_i x_i\right) = f\left(w^{T} x\right) \qquad (3)$$

where $n$ is the number of landslide conditioning variables, $w_i$ is the weight coefficient of each conditioning variable calculated by the ANN, $x_i$ is the input value from each class of each conditioning variable, and $f$ is the transfer function. Here, $T$ denotes the transpose of a matrix, and, in its simplest case, the output value $GS$ is computed as

$$GS = \begin{cases} 1, & w^{T} x \geq \theta \\ 0, & \text{otherwise} \end{cases} \qquad (4)$$

where $\theta$ is the threshold level; this type of node is called a linear threshold unit.
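To make the network structure concrete, here is a minimal sketch of a single forward pass through an MLP with sigmoid transfer functions. It assumes that the figures eleven, seven, and one refer to the numbers of input, hidden, and output nodes; the weights are random placeholders rather than weights learned by back-propagation, and all function names are hypothetical.

```python
import math
import random

def sigmoid(z):
    """Sigmoid transfer function, mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def layer(inputs, weights, biases):
    """One fully connected layer: sigmoid(w^T x + b) for each node."""
    return [sigmoid(sum(w * x for w, x in zip(ws, inputs)) + b)
            for ws, b in zip(weights, biases)]

def forward(x, hidden_w, hidden_b, out_w, out_b):
    """Forward pass: input -> hidden layer -> single output node."""
    hidden = layer(x, hidden_w, hidden_b)
    return layer(hidden, [out_w], [out_b])[0]

rng = random.Random(0)          # fixed seed so the sketch is repeatable
n_in, n_hidden = 11, 7          # assumed node counts (11 factors, 7 hidden nodes)

# Random placeholder weights; a trained model would obtain these by back-propagation.
hidden_w = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
hidden_b = [0.0] * n_hidden
out_w = [rng.uniform(-1, 1) for _ in range(n_hidden)]
out_b = 0.0

x = [rng.random() for _ in range(n_in)]   # normalized factor values for one cell
score = forward(x, hidden_w, hidden_b, out_w, out_b)
print(f"susceptibility score: {score:.3f}")
```

Replacing the sigmoid in the output node with a hard comparison against a threshold gives the linear threshold unit mentioned in the text.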

Model validation
Evaluating the landslide susceptibility model is essential, as a landslide susceptibility assessment lacks scientific significance without proper validation. The confusion matrix and receiver operating characteristic (ROC) curve analysis are commonly employed to assess the effectiveness of landslide susceptibility models [48,63]. ROC curves are built from confusion matrices, with 1 − specificity (the false positive rate) on the horizontal axis and sensitivity (the true positive rate) on the vertical axis. The area under the curve (AUC) value represents the area under the ROC curve [39]. The success and predictive capabilities of a model can be evaluated through the AUC values of the training and testing datasets, respectively. The confusion matrix, in turn, provides a clear depiction of misclassification across the different categories. Table 4 lists the statistical measures derived from the confusion matrix, including accuracy, precision, recall, and F-score.
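The validation measures above can be illustrated with a short sketch. The labels and scores are hypothetical, and the AUC is computed through its rank interpretation (the probability that a randomly chosen landslide cell receives a higher score than a randomly chosen non-landslide cell) rather than by integrating a plotted curve.

```python
def confusion_counts(actual, predicted):
    """Return (TP, FP, FN, TN) for binary labels coded as 1 and 0."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    return tp, fp, fn, tn

def metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1-score from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

def auc(actual, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for a, s in zip(actual, scores) if a == 1]
    neg = [s for a, s in zip(actual, scores) if a == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical validation sample: 4 landslide and 4 non-landslide cells.
actual = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
predicted = [1 if s >= 0.5 else 0 for s in scores]   # 0.5 cut-off is an assumption

print(confusion_counts(actual, predicted))  # (3, 1, 1, 3)
print(metrics(*confusion_counts(actual, predicted)))
print(auc(actual, scores))                  # 0.8125
```

Note that accuracy, precision, recall, and F1-score depend on the chosen cut-off, whereas the AUC summarizes performance over all possible cut-offs.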

Multi-collinearity assessment
The TOL and VIF tests were performed to check for collinearity among the considered factors, as linear collinearity among the conditioning variables reduces the predictive capability of a model [64]. Generally, a TOL less than 0.10 or a VIF greater than 5.0 indicates multicollinearity among the conditioning factors [65,66]. In this study, the multicollinearity analysis was carried out using the training data (70%) in SPSS software v. 26. The TOL values were found to be greater than 0.10 and the VIF values less than 5.0, which indicates that no collinearity exists among the considered variables (Table 5).

Logistic regression
The LR results revealed that the variables contributing most to landslide occurrence were slope angle, elevation, aspect, geology, distance to faults, and distance to roads, as their significance (p) values were less than 0.05. Slope is a significant conditioning factor that controls the surface velocity [67]. Steeper slopes accelerate detached material as it slides down the mass body, whereas gentle slopes are considered more stable. In the Hunza-Nagar valley, slopes with angles between 30° and 60° are more prone to landslide occurrence, which is in accordance with Dahal, et al. [68]. Elevation is an important topographic variable that affects the earth's surface through the spatial variability of climatic conditions, erosion, and weathering [67]. In this area, elevations from 2000 to 4000 m were found to be more susceptible to landslides, as these classes account for 67% of the total landslides. This elevation range is covered by seasonal snow that starts melting in spring and flows into the Hunza River. This is a continuous process throughout the year that destabilizes slopes by reducing the shear strength of the surface material. In the case of aspect, the south-facing surfaces in this area are directly exposed to sunlight and hence more affected by mechanical and chemical weathering. This result is in line with other studies [69,70]. Geologically, the southern Karakoram metamorphic complex (SKM) showed a high significance for landslide occurrence in the Hunza-Nagar valley. These rocks are highly fractured, jointed, and deformed, which makes them prone to slope failure [71]. Uncontrolled blasting and excavation for the reconstruction of the Karakorum highway (KKH) led to many shallow landslides, and most of the landslides in the Hunza-Nagar valley lie in close proximity to the road. The remaining conditioning factors showed less influence on the occurrence and distribution of landslides, based on their higher significance (p) values (> 0.05) [72,73]. Eq. (5) was used to calculate the probability of landslide occurrence in the study area.
Ultimately, the landslide susceptibility index for the logistic regression was calculated using Eq. (5). The resulting probability values vary from 0 to 1 [74]. The final susceptibility map values were subdivided into four classes, low, moderate, high, and very high (Fig. 4), using the natural breaks classification method [75]. In this model, the very high, high, moderate, and low susceptibility areas account for 17.06%, 35.05%, 27.48%, and 20.44% of the area, respectively (Fig. 5).

Artificial neural network model
To find a suitable ANN architecture, the MLP network was tested with 11 different neurons in its hidden layer (i.e., hidden neurons). The significance of the considered conditioning factors for the landslide susceptibility assessment was analyzed using the ANN model. The results showed that slope (14.50%), aspect (12.60%), distance to rivers (11.40%), elevation (10.50%), distance to the road (9.20%), and TWI (8.20%) have a great impact on the distribution and occurrence of landslide hazards in the study area (Table 6). On the other hand, distance to faults (7.80%), NDVI (5.60%), curvature (5.30%), geology (4.20%), and SPI (3.10%) contributed less to landslide occurrence. Finally, the landslide susceptibility map was prepared from these results. This map was classified into four classes using the equal interval classification method (Fig. 4). The shares of the individual classes are low (14%), moderate (28%), high (27%), and very high (29%) (Fig. 4).
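The equal interval classification mentioned above can be sketched as follows. The sketch assumes susceptibility values normalized to the range 0 to 1, so that four classes of equal width have breaks at 0.25, 0.50, and 0.75; the function name and sample values are hypothetical.

```python
def equal_interval_class(value, lo=0.0, hi=1.0,
                         labels=("low", "moderate", "high", "very high")):
    """Assign a value in [lo, hi] to one of len(labels) equally wide classes."""
    if not lo <= value <= hi:
        raise ValueError("value lies outside the classification range")
    width = (hi - lo) / len(labels)
    # The upper bound itself belongs to the last class.
    index = min(int((value - lo) / width), len(labels) - 1)
    return labels[index]

# Hypothetical susceptibility values for five grid cells.
for v in [0.10, 0.25, 0.49, 0.74, 1.00]:
    print(v, equal_interval_class(v))
```

Unlike natural breaks, which places class boundaries at gaps in the data distribution, equal interval breaks are fixed by the value range alone, so the share of area in each class can differ considerably between the two methods.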

Model validation results
After obtaining the results of the logistic regression (LR) and artificial neural network (ANN) models, it is necessary to check the performance of each model and then compare the two. The ROC curve based on the confusion matrix was computed for both the LR and ANN models. Figure 6 shows the ROC curves for the success rate of the two models: the success rate is 82.60% for the LR model and 77.90% for the ANN model (Fig. 6). For the LR training datasets, the accuracy, precision, recall, and F1-score are 0.839, 0.857, 0.813, and 0.834, while the corresponding values for the ANN training datasets are 0.825, 0.80, 0.80, and 0.80 (Table 7). The prediction rates were found to be 81.60% and 75.40% for the LR and ANN models, respectively. For the LR testing datasets, the validation metrics are accuracy (0.843), precision (0.837), recall (0.857), and F1-score (0.847).
Similarly, for the ANN testing datasets, the values are 0.819, 0.814, 0.833, and 0.824. Overall, LR outperformed ANN, with a success rate of 82.60% compared to 77.90%, and its accuracy, precision, recall, and F1-score were higher on both the training and testing datasets. This consistent performance makes LR the preferred model for landslide susceptibility assessment and mapping in similar settings.

Landslide index
The key step in preparing a landslide index map (LIM) is the selection of landslide triggering factors in the study area. Based on the circumstances of past landslides and a literature review, the annual mean rainfall was taken as the main triggering factor [74,76,77]. Most of the shallow landslides in the Hunza-Nagar valley were found to be triggered by rainfall events. Therefore, rainfall density was considered the primary triggering factor for landslide occurrence. The annual mean rainfall map was prepared using the precipitation data of the Pakistan Meteorological Department (PDMA) from 2000 to 2015. The map was divided into three classes ranging from 650 to 860 mm/year. After the triggering factor was prepared, the landslide susceptibility map was overlaid with it in the ArcGIS platform. It is very important to use the same dimensionless scale for both the triggering factor and landslide susceptibility maps. For this purpose, Eq. (6) was used [78].
$$X'_{ij} = \frac{X_{ij} - X_{\min,j}}{X_{\max,j} - X_{\min,j}} \qquad (6)$$

where $X'_{ij}$ is the standardized score of the $i$-th alternative for the $j$-th attribute, $X_{ij}$ is the raw score, and $X_{\max,j}$ and $X_{\min,j}$ are the maximum and minimum scores for the $j$-th attribute, respectively [78,79]. On the new scale, 0 corresponds to the lowest score and 1 to the highest score. The result is the landslide index map, divided into four classes using the natural breaks method (Fig. 7).
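The standardization step can be illustrated with a minimal sketch of min-max rescaling. The rainfall values are hypothetical cell values chosen within the reported 650 to 860 mm/year range, not the study's raster data.

```python
def standardize(values):
    """Rescale raw scores to [0, 1]: (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all scores are equal; the range is zero")
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical annual mean rainfall (mm/year) at four grid cells.
rainfall = [650.0, 720.0, 790.0, 860.0]
scaled = standardize(rainfall)
print([round(s, 3) for s in scaled])  # [0.0, 0.333, 0.667, 1.0]
```

After this step the rainfall layer shares the 0 to 1 scale of the susceptibility probabilities, so the two layers can be overlaid without one dominating the other because of its units.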

Vulnerability assessment
In landslide risk assessment, vulnerability to a landslide hazard is often taken to be the complete loss of lives and property, or the total destruction of the elements at risk in a region [80]. This simplification makes the analysis tractable, since there is usually little knowledge of the vulnerability of particular properties or individual elements at risk [81]. Mathematically, landslide vulnerability can be expressed as

$$V_H = P\left[D_H \geq 0 \mid H\right], \quad 0 \leq D_H \leq 1 \qquad (7)$$

where $D_H$ is the assessed (measured) or expected damage to an element given the occurrence of a landslide hazard ($H$) [82]. In Eq. (7), vulnerability is the probability of total loss of a particular element or the proportion of damage to an element caused by the occurrence of the landslide [83]. In either case, vulnerability is described on a scale from 0 to 1, with 0 indicating no loss and 1 indicating total loss or destruction [81]. Generally, the vulnerability of the elements at risk is expressed on heuristic (qualitative) or economic (monetary, quantitative) scales [84]. On economic scales, vulnerability is usually described in terms of the value of the element, including intrinsic, utilitarian, and monetary values. On heuristic scales, vulnerability is described in qualitative (descriptive) terms that convey the expected or actual degree of exposure [84].

As a first step, a GIS-based land cover (LC) map was developed using remote sensing techniques. For this purpose, Landsat 8 (Collection 1) satellite imagery was acquired and used. The imagery was rectified against a 1:25,000-scale topographic sheet and resampled using a first-order polynomial transformation and the nearest-neighbor algorithm, which keeps the original pixel intensity values unchanged [85,86]. The maximum likelihood classification method was used for the LC image classification. This classifier has been reported to outperform other standard classifiers, such as minimum distance, in nearly all instances, with high overall accuracy [87]. After the image classification, a land cover map for this region was obtained. Kappa statistics analysis, a discrete multivariate technique commonly used in accuracy tests, was performed for the accuracy assessment [88]. The result showed that the accuracy of the land cover map is 94%. After the accuracy assessment, the land cover map was divided into six classes: forests, settlements, agricultural land, bare land, water bodies, and snow cover. We considered the population density for the vulnerability map and hence extracted the settlement data from the land cover map. The total area covered by settlements is 54.25 km². These data were taken as the elements at risk. The LC map was then reclassified into two classes: settlements were assigned a value of 1, while the remaining five classes were merged into one class and assigned a value of 0 (Fig. 8).
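For readers unfamiliar with the Kappa statistic used in the accuracy assessment, the following sketch computes Cohen's kappa from an error (confusion) matrix. It assumes that the study's Kappa analysis follows this standard definition; the two-class matrix is hypothetical and is not the error matrix of the land cover map.

```python
def cohen_kappa(matrix):
    """Cohen's kappa from a square error matrix.

    matrix[i][j] is the number of reference samples of class i
    that the classifier labeled as class j.
    """
    k = len(matrix)
    total = sum(sum(row) for row in matrix)
    # Observed agreement: share of samples on the diagonal.
    observed = sum(matrix[i][i] for i in range(k)) / total
    # Chance agreement: product of row and column totals for each class.
    expected = sum(sum(matrix[i]) * sum(row[i] for row in matrix)
                   for i in range(k)) / total ** 2
    return (observed - expected) / (1.0 - expected)

# Hypothetical check samples for two classes (settlement, other).
error_matrix = [[45, 5],
                [5, 45]]
print(round(cohen_kappa(error_matrix), 3))  # 0.8
```

Kappa discounts the agreement that would be expected by chance, so it is normally lower than the overall accuracy computed from the same matrix (here 0.90 overall accuracy against a kappa of 0.80).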

Landslide risk modeling
The purpose of risk analysis is to assess the possibility that a particular hazard may cause damage and to analyze the association between the occurrence of adverse events and the severity of their effects [89]. Different researchers have proposed basic concepts of landslide risk evaluation. Varnes [90] stated that the purpose of landslide risk assessment is to estimate the potential loss due to hazards in terms of the expected number of lives lost, people injured, property destroyed, and economic activity disrupted. Landslide risk analysis is divided into two types: quantitative (probabilistic) and qualitative (heuristic) techniques [89]. The quantitative approach attempts to assess the risk of casualties or the potential for destruction due to mass movement [91-94]. Estimating the likelihood of failure requires an inventory of past landslides and their consequences. In this study, a quantitative approach is used for the landslide risk assessment of the Hunza-Nagar valley. To develop the landslide risk map, the landslide index and vulnerability maps were combined using Eq. (8). In this equation, the landslide risk is the product of the landslide index and vulnerability maps, which can be expressed mathematically as

$$R_G = H_G \times V_G \qquad (8)$$

where $H_G$ and $V_G$ are the landslide index and vulnerability probabilities for the Hunza-Nagar valley and $R_G$ is the resulting landslide risk. The risk index map was subdivided into categorical risk areas to support visual interpretation and to identify the landslide risk areas more clearly. For the classification, the standard deviation classification method was used in ArcGIS software. This method is commonly recommended because it derives the class breaks from the mean value [19]. Using this technique, the landslide risk map was divided into four classes: (i) low, (ii) moderate, (iii) high, and (iv) very high (Fig. 8). Based on this map, 37.25% (20.21 km²) and 47.64% (25.84 km²) of the total settlement area lie in the low and very high-risk zones, respectively, while 5.40% (2.93 km²) and 9.72% (5.27 km²) lie in the moderate and high-risk zones (Fig. 8).
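The risk calculation and the area shares reported above can be checked with a short sketch. The 2 × 2 grids are hypothetical stand-ins for the landslide index and binary vulnerability rasters; the settlement areas per risk class are the values reported in the text.

```python
def risk_map(index, vulnerability):
    """Cell-by-cell product R = H * V of two equally shaped grids."""
    return [[h * v for h, v in zip(h_row, v_row)]
            for h_row, v_row in zip(index, vulnerability)]

def class_shares(areas_km2):
    """Percentage of the total area that falls in each risk class."""
    total = sum(areas_km2.values())
    return {name: 100.0 * area / total for name, area in areas_km2.items()}

# Hypothetical landslide index (0 to 1) and vulnerability (1 = settlement, 0 = other).
index = [[0.1, 0.6],
         [0.9, 0.4]]
vulnerability = [[1, 0],
                 [1, 1]]
risk = risk_map(index, vulnerability)
print(risk)  # non-settlement cells receive zero risk

# Settlement area per risk class as reported in the text (km^2).
areas = {"low": 20.21, "moderate": 2.93, "high": 5.27, "very high": 25.84}
shares = class_shares(areas)
for name, share in shares.items():
    print(f"{name}: {share:.2f}%")
```

Because vulnerability is coded as 0 outside settlements, the product confines the risk map to settled cells, which is why the class percentages are expressed relative to the 54.25 km² settlement area rather than to the whole valley.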

Discussion
The landslide risk assessment of the Hunza-Nagar valley integrated multi-collinearity analysis, logistic regression (LR) and artificial neural network (ANN) modelling, and subsequent validation through AUROC curve analysis. The absence of collinearity among the conditioning factors supports the reliability of the models and provides a solid foundation for interpreting the results. LR identified slope angle, elevation, aspect, geology, distance to faults, and distance to roads as significant contributors to landslide occurrence. Steeper slopes, specific elevation ranges, south-facing aspects, and geological features were found to influence susceptibility. This aligns with studies such as Tesfa and Woldearegay [95], Shirzadi, et al. [96] and Ahmad, et al. [39], which emphasize the importance of these factors in landslide susceptibility. Human activities, particularly road construction, had a noticeable impact on landslide distribution. The ANN model reinforced these findings, highlighting the importance of slope, aspect, distance to rivers, elevation, distance to roads, and TWI in landslide distribution. The resulting susceptibility maps categorized the area into distinct susceptibility zones.
AUROC curve analysis showed LR to be the more effective model for mapping hazards, with a success rate of 82.60% against 77.90% for ANN. This validation step increases confidence in the reliability of the LR model for similar hazard mapping scenarios. The result is consistent with the findings of Kalantar, et al. [97] in a similar geographical context. Wang, et al. [98] also observed that the LR model performed better than the ANN model, and highlighted the possibility of enhancing model accuracy by selecting appropriate landslide samples for training. In contrast, Park, et al. [99] and Yilmaz [61] found that the ANN model outperformed other models in their studies. Possible reasons for these differences include the geographical location, environmental conditions, and the availability of complete datasets.
The landslide index map, derived with annual mean rainfall as the primary triggering factor, reflects both the triggering factor and the topographic conditions. Akgun [100], Galve, et al. [101] and Promper, et al. [102] emphasize different factors, such as seismic activity or land use changes. These differences underscore the regional variability in landslide triggers and the importance of tailoring approaches to local conditions. The standardized landslide index map, categorized into four classes, provides an overview of landslide hazards across the study area. The vulnerability assessment, which used remote sensing techniques to develop a land cover (LC) map, achieved high accuracy (94%). The use of remote sensing techniques for vulnerability assessment is common to studies such as Tan, et al. [103] and Michael and Samanta [104]. Settlements, identified as the elements at risk, formed the basis for the vulnerability analysis.
The quantitative risk assessment, which integrated the landslide index and vulnerability maps, resulted in a landslide risk map. The categorization of risk into low, moderate, high, and very high zones is consistent with various studies, including Akgun [100] and Ahmad, et al. [39]. This common classification allows for better comparability of results. The map can support decision-makers in developing targeted strategies to mitigate landslide impacts in the Hunza-Nagar valley and similar regions.
Although the landslide risk assessment of the Hunza-Nagar valley provides valuable insights, it is important to acknowledge certain limitations. The accuracy and reliability of our models depend on the availability and quality of data. In this study, we relied on existing datasets for the conditioning factors, and any inaccuracies or limitations in these datasets could affect the precision of our results. Additionally, the absence of real-time or continuous monitoring data introduces a temporal limitation, as landslide susceptibility and risk conditions may vary over time. Despite these limitations, our study contributes to the existing body of knowledge on landslide susceptibility and risk assessment, and future research that addresses them can improve the accuracy and applicability of the results.

Fig. 1 The location map of the study area

Fig. 3 Landslide conditioning factors maps prepared for this study

Fig. 4 The landslide susceptibility maps of the Hunza-Nagar valley: (a) landslide map prepared using the ANN model; (b) landslide map prepared using the LR model

Fig. 5 The final LSMs prepared using the LR and ANN models, classified into four classes: low, moderate, high, and very high

Fig. 6

Fig. 7 Landslide index map of the Hunza-Nagar valley and ROC curve analysis for this map

Fig. 8 Landslide risk map of the Hunza-Nagar valley and the vulnerability map prepared for this region

Table 2 Different lithological formations of the Hunza-Nagar valley

Table 3 provides a brief description of the considered conditioning variables.

Table 3 Landslide conditioning factors used in this study

Table 4 Confusion matrix measurements
Metric | Formula | Term | Description
Precision | TP / (TP + FP) | FP | The observed value is negative, but the predicted value is positive
Recall (Sensitivity) | TP / (TP + FN) | FN | The actual value is positive, but the predicted value is negative
F-score | 2 × Precision × Recall / (Precision + Recall) | TN | Both the actual and predicted values are negative

Table 5
https://doi.org/10.1007/s42452-024-05646-2

where NDVI is the raster NDVI values; TWI is the classified TWI raster values; elevation is the classified elevation raster values; faults is the classified fault raster values; river is the classified river raster values; curvature is the classified curvature raster values; road is the classified road raster values; SPI is the classified SPI raster values; slope angle is the classified slope raster values; and Geology_b and Aspect_b are the logistic regression coefficient values listed in Table 6.