# A GIS-based logistic regression model in rock-fall susceptibility mapping along a mountainous road: Salavat Abad case study, Kurdistan, Iran

DOI: 10.1007/s11069-012-0321-3

- Cite this article as:
- Shirzadi, A., Saro, L., Hyun Joo, O. et al. Nat Hazards (2012) 64: 1639. doi:10.1007/s11069-012-0321-3

## Abstract

This study describes the application of logistic regression to rock-fall susceptibility mapping along 11 km of a mountainous road on the Salavat Abad saddle, in southwest Kurdistan, Iran. To determine the factors influencing rock-falls, data layers of slope degree, slope aspect, slope curvature, elevation, distance to road, distance to fault, lithology, and land use were analyzed by logistic regression. A spatial database of 68 sites (34 rock-fall point cells with a value of 1 and 34 no-rock-fall point cells with a value of 0) was developed and analyzed using a Geographic Information System (GIS). The results are shown as a rock-fall susceptibility map with four susceptibility classes. Distance to fault, lithology, slope curvature, slope degree, and distance to road were found to be the most important factors affecting rock-fall. About 76 % of the study area was classified as having moderate or high susceptibility. Rock-fall point cells were used to verify the susceptibility map using the success rate curve and the area under the curve (AUC); the AUC for the rock-fall susceptibility map is 77.57 %. The results demonstrate that a logistic regression model within a GIS framework is useful and suitable for rock-fall susceptibility mapping. The rock-fall susceptibility map can be used to help reduce losses associated with rock-falls.

### Keywords

Rock-fall, Susceptibility map, Logistic regression, Salavat Abad, Kurdistan, Iran

## 1 Introduction

Studies of rock-falls are often based on field surveys, and susceptibility is estimated either by an empirical assessment of susceptibility to failure, or by the calculation of a safety factor derived from models of rock mechanics (e.g., Hoek and Bray 1981). Once a location is identified with rock-fall risk, the probability of the maximum travel distance and maximum energy of impact of rock-fall events at the location is normally assessed using computer simulations (e.g., Wu 1985; Kobayashi et al. 1990; Azzoni et al. 1995).

The many models used to study rock-fall events fall into three main categories: (1) empirical models based on relationships between topographical factors and the run-out length of rock-falls (e.g., Keylock and Domaas 1999), (2) process-based models that describe or simulate the modes of motion of falling rocks over slope surfaces (e.g., Kirkby and Statham 1975; Statham 1976; Hungr and Evans 1988; Pfeiffer and Bowen 1989; Kobayashi et al. 1990), and (3) GIS-based models that either run within a GIS environment or are raster-based models whose input data are provided by GIS analysis (e.g., Evans and Hungr 1993; Hegg and Kienholz 1995; Chau et al. 2004). Little work has been done on GIS-based rock-fall susceptibility mapping (e.g., Carrara et al. 1995; Chung et al. 1995; Guzzetti et al. 1999; Suzen and Doyuran 2004a, b; Chau et al. 2004). Susceptibility maps have been found to be very useful in estimating, managing, and mitigating mass movement susceptibility for a region (e.g., Corominas and Santacana 2003; Chung and Fabbri 2003; Sassa et al. 2004).

Various methods are available for susceptibility mapping, including semi-qualitative methods such as the analytical hierarchy process (AHP) (e.g., Barredol et al. 2000), bivariate statistical analysis (e.g., Kelarestaghi and Ahmadi 2009; Nandi and Shakoor 2009), the probability–frequency ratio model (e.g., Lee and Pradhan 2006), and multivariate regression methods such as logistic regression (e.g., Lee and Sambath 2006; Pradhan 2010; Su and Cui 2010; Choi et al. 2012). The recent and rapid increase in computing capacity has also allowed scientists to treat large sets of data, which is a crucial factor in applying multivariate statistical analysis. Multivariate procedures have long been employed for landslide susceptibility mapping (e.g., Reger 1979; Carrara et al. 1992; Gorseveski et al. 2000; Baeza and Corominas 2001; Lee and Min 2001; Ayenew and Barbieri 2005; Can et al. 2005; Chau and Chan 2005; Greco et al. 2007). Among the various susceptibility mapping methods, logistic regression (LR) presents certain advantages for studying landslides in soil and/or weathered rock (e.g., Gorseveski et al. 2000; Dai and Lee 2002; Chau et al. 2004; Ayalew and Yamagishi 2004; Lee and Sambath 2006; Akgun and Bulut 2007; Akgun et al. 2008; Lamelas et al. 2008). Because very few studies have addressed rock-fall susceptibility mapping based on GIS and statistical analysis (e.g., Chau et al. 2004), we decided to study this topic further. The aim of this research is therefore to apply and assess a logistic regression model for generating a rock-fall susceptibility map along a mountainous road in western Iran.

## 2 Materials and methods

### 2.1 Topographic and geologic setting of the study area

The rock-fall distribution map was generated in two steps, one in the laboratory and one in the field. Rock-falls are difficult to recognize on aerial photographs because they are located on steep and artificial slopes and are easily confused with man-made objects. They are also smaller in scale than landslides and therefore not as easily identified. For these reasons, the rock-falls in the study area were recognized directly in the field. The rock-fall locations were recorded by the Transport Office of the Kurdistan province in 2006. Based on the discontinuities in the rock-falls, the geologists at the Transport Office recorded the center of each rock-fall slope, giving 34 rock-falls in the study area. Once the 34 rock-falls had been selected, 50 no-rock-fall locations were recorded by the same method, and 34 of these 50 locations were then randomly selected. The data in the report were verified by field surveying and by observations of slope instability. The slopes on which rock-falls had occurred had a high density of cracks and joints; on some slopes in the study area, the cracks and joints were comparatively large, causing rocks to fall down the slope and collect at its toe (Fig. 1).

### 2.2 Data and methods

Table 1 Data layers of the study area

| Classification | Sub-classification | Extracted factor | GIS data type | Scale | References |
|---|---|---|---|---|---|
| Basic map | Rock-fall | Rock-falls inventory | Point coverage | 1:25,000 | Field works |
| Topography | | Slope gradient | Grid | 20 m × 20 m | Iran Cartographic Organization |
| | | Slope aspect | Grid | 20 m × 20 m | |
| | | Slope curvature | Grid | 20 m × 20 m | |
| Geology | | Lithology | Polygon coverage | 1:100,000 | Iran Geological Organization |
| | | Fault | Line coverage | | |
| Landsat ETM image | | Land use | Grid | 20 m × 20 m | Iran Space Agency |

### 2.3 Logistic multiple regression

Logistic regression relates the probability of an event to a set of independent variables through the logistic function:

\[ p(\text{event}) = \frac{1}{1 + e^{-Z}} \]

where *p*(event) is the probability of an event occurring. In the present study, *p*(event) is the estimated probability of rock-fall occurrence. As *Z* varies from −∞ to +∞, the probability varies from 0 to 1. *Z* is the linear combination:

\[ Z = B_0 + B_1 X_1 + B_2 X_2 + \cdots + B_n X_n \]

Here *Y* = *p*(event) is the probability of rock-fall occurrence, \( B_i \) (\( i = 0,1, \ldots ,n \)) are the coefficients estimated from the sample data, *n* is the number of independent variables, and \( X_i \) (\( i = 1, \ldots ,n \)) are the independent variables.

In logistic multiple regression, a coding scheme has to be selected for the categorical variables: a new set of variables is created that corresponds in some way to the original categories. The number of new variables required to represent a categorical variable is one less than the number of its categories. The coefficients of the logistic multiple regression model are estimated using the maximum-likelihood method. In other words, the coefficients that make the observed results most “likely” are selected. Since the relationship between the independent variables and the probability is nonlinear in the logistic multiple regression model, an iterative algorithm is necessary for parameter estimation.

Logistic multiple regression modeling is intended to describe the likelihood of rock-fall occurrence on a regional scale and is very suitable for the assessment of slope instability, since the observed data consist of locations (points) or cells with a value of 0 (absence of rock-fall) or 1 (presence of rock-fall). This method allows a spatial distribution of probabilities or susceptibility values to be calculated within the GIS environment.
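The logistic function, the linear combination *Z*, and the "one less than the number of categories" coding rule described in this section can be illustrated with a minimal Python sketch. The function names, the choice of the first category as the reference level, and the numeric coefficients below are assumptions made for illustration only; they are not taken from the study.

```python
import math


def logistic(z):
    """Map a linear predictor Z in (-inf, +inf) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def linear_predictor(intercept, coefs, xs):
    """Compute Z = B0 + B1*X1 + ... + Bn*Xn."""
    return intercept + sum(b * x for b, x in zip(coefs, xs))


def dummy_code(category, categories):
    """Encode a categorical value as len(categories) - 1 indicator variables.

    Assumption: the first entry of `categories` is the reference level,
    which is encoded as all zeros.
    """
    return [1 if category == c else 0 for c in categories[1:]]


# Hypothetical three-class factor and hypothetical coefficients.
curvature_classes = ["concave", "flat", "convex"]
x = dummy_code("convex", curvature_classes)      # two indicators for three classes
z = linear_predictor(-1.0, [0.5, -1.5], x)       # B0 = -1.0, B1 = 0.5, B2 = -1.5
p = logistic(z)                                  # probability between 0 and 1
```

In a real workflow the coefficients would come from an iterative maximum-likelihood fit in a statistics package rather than being typed in by hand.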

## 3 Results

### 3.1 Rock-fall susceptibility modeling

Table 2 List of independent variables used in logistic regression

| Slope degree (°) | Symbol | Slope aspect | Symbol | Elevation (m) | Symbol | Land use | Symbol |
|---|---|---|---|---|---|---|---|
| 0–10 | A1 | Flat | B1 | 1,699–1,800 | C1 | Garden | D1 |
| 10–15 | A2 | North | B2 | 1,800–1,900 | C2 | Garden and range | D2 |
| 15–20 | A3 | Northeast | B3 | 1,900–2,000 | C3 | | D3 |
| 20–25 | A4 | East | B4 | 2,000–2,100 | C4 | Rocky area | D4 |
| 25–30 | A5 | Southeast | B5 | 2,100–2,200 | C5 | Salavat Abad village | D5 |
| 30–35 | A6 | South | B6 | 2,200–2,300 | C6 | | |
| 35–40 | A7 | Southwest | B7 | 2,300–2,400 | C7 | Semi-density range | |
| >40 | A8 | West | B8 | 2,400–2,500 | C8 | | |
| | | Northwest | B9 | | | | |

| Lithology | Symbol | Distance to road (m) | Symbol | Distance to fault (m) | Symbol | Slope curvature | Symbol |
|---|---|---|---|---|---|---|---|
| Basalt and andesite | E1 | 0–100 | F1 | 0–150 | G1 | Concave | H1 |
| | E2 | 100–200 | F2 | 150–300 | G2 | Straight (flat) | H2 |
| Limestone | E3 | 200–300 | F3 | 300–450 | G3 | | H3 |
| Conglomerate and shale | | 300–400 | F4 | 450–600 | G4 | Convex | |
| | | 400–500 | F5 | >600 | G5 | | |
| | | >500 | F6 | | | | |

Table 3 The coefficients, significance, and Exp (β) for logistic regression in this study

| Independent parameter | Class | Coefficient | Significance | Exp (β) |
|---|---|---|---|---|
| Slope angle | 15–20° | −2.705 | 0.033 | 0.067 |
| Geology | Limestone | 3.135 | 0.000 | 22.978 |
| Distance to road | 100–200 m | −2.832 | 0.027 | 0.059 |
| Distance to fault | 450–600 m | 4.976 | 0.011 | 144.825 |
| Slope curvature | Convex | −1.672 | 0.025 | 0.188 |
| Constant | | −2.189 | 0.005 | 0.112 |
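As a rough check on how the tabulated coefficients relate to Exp (β) and to a predicted probability, the following Python sketch exponentiates each coefficient and evaluates the logistic function for a chosen combination of classes. The dictionary keys are hypothetical labels, and the sketch assumes that every class not listed in the table contributes zero to *Z*; the source does not state the reference classes, so this is an illustration rather than the authors' procedure.

```python
import math

# Coefficients and constant as reported in the table above.
COEFS = {
    "slope_15_20_deg": -2.705,
    "limestone": 3.135,
    "road_100_200_m": -2.832,
    "fault_450_600_m": 4.976,
    "curvature_convex": -1.672,
}
CONSTANT = -2.189


def odds_ratio(coefficient):
    """Exp(beta): multiplicative change in the odds when the class is present."""
    return math.exp(coefficient)


def probability(active_classes):
    """Estimated rock-fall probability for a cell with the given classes present.

    Assumption: classes absent from COEFS add nothing to Z.
    """
    z = CONSTANT + sum(COEFS[name] for name in active_classes)
    return 1.0 / (1.0 + math.exp(-z))


odds = {name: odds_ratio(b) for name, b in COEFS.items()}
p_baseline = probability([])              # constant only
p_limestone = probability(["limestone"])  # constant plus limestone term
```

Small differences between `odds` and the tabulated Exp (β) values are expected, because the published coefficients are rounded to three decimals.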

### 3.2 Susceptibility map and its reliability

A prediction is viewed as successful when the estimated probability is *p* ≥ 0.5 for a rock-fall cell and *p* < 0.5 for a non-rock-fall cell. In validation using the error matrix method, the accuracy rate is 85.3 % for the rock-fall group and 70.6 % for the non-rock-fall group. The total accuracy rate is 77.9 %, which is considered acceptable (Table 4).

Table 4 Classification table and percentage of correct predictions

| Observed | Predicted: absence of rock-fall (0) | Predicted: presence of rock-fall (1) | Percentage correct (%) |
|---|---|---|---|
| Absence of rock-fall (0) | 24 | 10 | 70.6 |
| Presence of rock-fall (1) | 5 | 29 | 85.3 |
| Overall | | | 77.9 |
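The percentages in the classification table follow directly from its four cell counts. A minimal Python sketch of that arithmetic is given below; the function and key names are illustrative and not taken from the study.

```python
def classification_rates(tn, fp, fn, tp):
    """Return per-class and overall percentage correct from a 2x2 error matrix.

    tn: observed absence, predicted absence
    fp: observed absence, predicted presence
    fn: observed presence, predicted absence
    tp: observed presence, predicted presence
    """
    total = tn + fp + fn + tp
    return {
        "absence_correct_pct": 100.0 * tn / (tn + fp),
        "presence_correct_pct": 100.0 * tp / (tp + fn),
        "overall_correct_pct": 100.0 * (tn + tp) / total,
    }


# Cell counts from the classification table above.
rates = classification_rates(tn=24, fp=10, fn=5, tp=29)
for name, value in rates.items():
    print(f"{name}: {value:.1f}")
```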

As with *R*² in linear regression, there are also correlation coefficients for logistic regression analysis, called Cox and Snell *R*² (e.g., Cox and Snell 1989) and Nagelkerke *R*² (e.g., Nagelkerke 1991). For the present model, they are 0.418 and 0.558, respectively (Table 5). The theoretical values of these coefficients range from 0 to 1. Unlike in linear regression, where *R*² > 0.9 is normally considered a good indicator of a reasonable fit, these coefficients can be relatively small without necessarily invalidating the model, and there is no universal standard for the values of Cox and Snell *R*² and Nagelkerke *R*² that a regression should reach to be acceptable.

Table 5 Some statistics and map accuracy evaluation

| Independent variables | −2 log likelihood (−2LL) | Cox and Snell *R*² | Nagelkerke *R*² | AUC (%) |
|---|---|---|---|---|
| All variables | 57.445 | 0.418 | 0.558 | 77.57 |
| Without slope degree | 59.904 | 0.397 | 0.529 | 71.80 |
| Without slope curvature | 62.985 | 0.396 | 0.492 | 73.97 |
| Without elevation | 57.445 | 0.418 | 0.558 | 73.11 |
| Without distance to road | 72.230 | 0.277 | 0.369 | 66.17 |
| Without distance to fault | 72.390 | 0.275 | 0.367 | 70.24 |
| Without lithology | 62.167 | 0.376 | 0.502 | 72.93 |
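The Cox and Snell and Nagelkerke values for the full model can be reproduced from the tabulated −2LL with the standard definitions of the two coefficients. The sketch below assumes that the null model is an intercept-only model fitted to the 34 rock-fall and 34 no-rock-fall cells; the paper does not report the null −2LL, so that value is computed here rather than quoted.

```python
import math


def null_deviance(n_pos, n_neg):
    """-2 log likelihood of an intercept-only model (assumed null model)."""
    n = n_pos + n_neg
    p = n_pos / n
    return -2.0 * (n_pos * math.log(p) + n_neg * math.log(1.0 - p))


def cox_snell_r2(dev_null, dev_model, n):
    """Cox and Snell R^2 = 1 - (L0 / L1)^(2/n), written in terms of -2LL."""
    return 1.0 - math.exp(-(dev_null - dev_model) / n)


def nagelkerke_r2(dev_null, dev_model, n):
    """Cox and Snell R^2 rescaled by its maximum attainable value."""
    max_r2 = 1.0 - math.exp(-dev_null / n)
    return cox_snell_r2(dev_null, dev_model, n) / max_r2


n_cells = 68
d0 = null_deviance(34, 34)   # assumption: balanced intercept-only null model
d1 = 57.445                  # -2LL of the model with all variables (table above)
cs = cox_snell_r2(d0, d1, n_cells)
nk = nagelkerke_r2(d0, d1, n_cells)
print(f"Cox and Snell R2 = {cs:.3f}, Nagelkerke R2 = {nk:.3f}")
```

Under this assumption the computed values agree with the "All variables" row to three decimals, which supports the reading of −2LL as the model deviance.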

Table 6 Comparison of rock-fall occurrence and rock-fall susceptibility map using logistic regression method for the case study

| Range of susceptibility map | Average of susceptibility map | Number of pixels | Percentage of pixels (a, %) | Number of rock-falls | Percentage of rock-falls (b, %) | b/a |
|---|---|---|---|---|---|---|
| 0–0.228 | 0.266 | 15,782 | 33.74 | 4 | 11.764 | 0.348 |
| 0.228–0.380 | 0.341 | 15,021 | 32.11 | 3 | 8.823 | 0.274 |
| 0.380–0.458 | 0.419 | 4,247 | 9.08 | 3 | 8.823 | 0.971 |
| 0.458–0.535 | 0.496 | 10,563 | 22.58 | 21 | 61.764 | 2.73 |
| 0.535–0.613 | 0.574 | 671 | 1.43 | 2 | 5.882 | 4.11 |
| 0.613–0.690 | – | – | – | – | – | – |
| 0.690–0.768 | – | – | – | – | – | – |
| 0.768–0.845 | – | – | – | – | – | – |
| 0.845–0.923 | – | – | – | – | – | – |
| 0.923–1 | 0.961 | 497 | 1.06 | 1 | 2.944 | 2.77 |
| Sum | | 46,781 | 100 | 34 | 100 | |
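The b/a column is the share of rock-falls in a class divided by the share of pixels in that class, so values above 1 mean rock-falls are over-represented relative to area. The Python sketch below recomputes the ratio from the pixel and rock-fall counts of the non-empty classes; the variable names are illustrative.

```python
# (range label, number of pixels, number of rock-falls) for the non-empty
# classes of the comparison table above.
classes = [
    ("0-0.228", 15782, 4),
    ("0.228-0.380", 15021, 3),
    ("0.380-0.458", 4247, 3),
    ("0.458-0.535", 10563, 21),
    ("0.535-0.613", 671, 2),
    ("0.923-1", 497, 1),
]

total_pixels = sum(pixels for _, pixels, _ in classes)
total_rockfalls = sum(count for _, _, count in classes)

ratios = {}
for label, pixels, count in classes:
    a = 100.0 * pixels / total_pixels        # percentage of pixels
    b = 100.0 * count / total_rockfalls      # percentage of rock-falls
    ratios[label] = b / a                    # b/a > 1: over-represented class

for label, ratio in ratios.items():
    print(f"{label}: b/a = {ratio:.2f}")
```

Recomputed ratios can differ from the table in the last digit because the published percentages are themselves rounded.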

Table 7 Rock-fall distribution in predicted rock-fall susceptible zone

| Rock-fall susceptible zone | Area of predicted zone (%) | Area of observed rock-fall per class (%) | SCAI (rock-fall density) |
|---|---|---|---|
| Very low | 42.12 | 14.72 | 2.8614 |
| Low | 13.21 | 8.82 | 1.4977 |
| Moderate | 38.15 | 64.67 | 0.5899 |
| High | 6.52 | 11.79 | 0.5530 |
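The tabulated SCAI values are consistent with dividing the percentage area of each predicted zone by the percentage of observed rock-fall in that zone. The short Python sketch below applies that ratio; treating SCAI as exactly this quotient is an inference from the table values, not a definition quoted from the text.

```python
# zone -> (percentage area of predicted zone, percentage of observed rock-fall)
zones = {
    "Very low": (42.12, 14.72),
    "Low": (13.21, 8.82),
    "Moderate": (38.15, 64.67),
    "High": (6.52, 11.79),
}


def scai(area_pct, rockfall_pct):
    """Assumed SCAI: zone area share divided by observed rock-fall share."""
    return area_pct / rockfall_pct


scai_values = {zone: scai(area, rf) for zone, (area, rf) in zones.items()}

for zone, value in scai_values.items():
    print(f"{zone}: SCAI = {value:.4f}")
```

A well-behaved map should show SCAI decreasing from the very low to the high susceptibility zone, which is the pattern these four values follow.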

## 4 Discussion and conclusion

The aim of this study was to assess the efficiency of a logistic regression model for rock-fall susceptibility mapping along a mountainous road in the Kurdistan province, Iran. Rock-falls are natural phenomena that often have detrimental consequences, and a rock-fall susceptibility map can help to prevent and manage them effectively. Many qualitative and quantitative techniques are useful for analyzing the relationship between rock-falls and the parameters that affect them. In this research, we attempted to produce a rock-fall susceptibility map from the relationship between rock-fall locations and their determining parameters. The logistic regression model was applied to study the impact of the different parameters on rock-fall and to map the susceptibility of the area.

The first results of the logistic regression were the model statistics and coefficients, which were useful for assessing the accuracy of the regression function and the role of the parameters in the presence or absence of rock-falls. The forward conditional stepwise method was applied for the statistical analyses. Slope aspect, land use, and elevation were found not to be significant for predicting rock-fall and were excluded from the final model. For the study area, the most important factor is the 450–600 m distance-to-fault class, with a coefficient of 4.976, a *p* value of 0.011, and Exp (β) equal to 144.825. This means that, other factors being equal, the odds of rock-fall occurrence for cells in this class are about 145 times those for cells in the reference class. The other factors are geology (limestone), slope curvature (convex), slope angle (15–20°), and distance to road (100–200 m), respectively. Distance to fault and geology, having positive coefficients and high values of Exp (β), have more influence than the other factors. The results of the logistic regression model were validated using several validation strategies and were accepted. The results are in line with those of Chau et al. (2004) and Chau and Chan (2005).

Considering the coefficients estimated for the logistic regression (Table 3), the “closeness to roads” parameter was found to have a strong relationship with rock-fall occurrence. Ayalew et al. (2005) identified “proximity to roads” as the most important factor in landslide occurrence in Kakuda-Yahiko, central Japan, and reported that most of the landslides were located within 0–100 m of roads. Lee and Sambath (2006), Greco et al. (2007), and Kelarestaghi and Ahmadi (2009) have also emphasized the adverse effect of road construction on landslide occurrence. Table 2 indicates that geological units belonging to the Cretaceous period and consisting mostly of limestone are more susceptible to rock-fall. Can et al. (2005) and Nefeslioglu et al. (2008) emphasized the causative role of geological units in mass movements. Land use was excluded from the logistic regression model because land use units vary little across the study area and most of the rock-falls have occurred in rocky areas.

In the Salavat Abad saddle, the Salavat Abad fault, acting through a thrust mechanism, generated joints and cracks of different sizes; with continued physical weathering and thawing, these have caused rocks to fall under gravity toward the toe of the slope. The road in the region is encircled by this fault in a spiral form, and most of the rock-falls have occurred around it. Therefore, the presence of the fault is one of the most important factors for rock-fall occurrence in the study area. The removal of slope aspect from the logistic regression model is also attributed to the Salavat Abad fault: its role was so dominant that slope aspect did not show any statistical relationship with rock-falls, and some of the rock-falls occurred on the west slope instead of on the east slope.

The rock-fall susceptibility map was classified into four categories: very low, low, moderate, and high. The map was assessed using the area under the curve (AUC) of the ROC (Fig. 7). The AUC value of 77.57 % indicates that the rock-fall susceptibility map of the study area is reasonably accurate.

To compare the results quantitatively, the AUC was recalculated with each factor removed from the model in turn (Table 5). With all factors in the logistic regression model, the AUC is 77.57 %; the values obtained when a single factor is removed are 71.80 % (without slope degree), 73.97 % (without slope curvature), 73.11 % (without elevation), 66.17 % (without distance to road), 70.24 % (without distance to fault), and 72.93 % (without lithology). Because removing any factor lowers the AUC, all variables in the rock-fall susceptibility map appear effective. The use of the success rate curve and the AUC to investigate the reliability of a susceptibility map is in agreement with Lee and Sambath (2006), Lee and Pradhan (2006, 2007), Oh et al. (2009), Jadda et al. (2009), Nandi and Shakoor (2009), Pradhan (2010), Pradhan and Lee (2010), Chauchan et al. (2010), and Oh and Pradhan (2011). We have also found that 14.72, 8.82, 64.67, and 11.79 % of the area is located in the very low, low, moderate, and high susceptibility zones, respectively.

Table 6 indicates that the predicted susceptibility agrees well with the observed rock-falls: in the classes with a susceptibility of less than 0.5, b/a < 1, and in the classes with a susceptibility of more than 0.5, b/a > 1. This result is in line with that of Chau and Chan (2005), who evaluated a landslide susceptibility map using logistic regression.

The SCAI values given in Table 7 show that the generated map is adequate, because the high and moderate susceptibility classes have very low SCAI values, whereas the very low and low susceptibility classes have very high SCAI values.

In general, in addition to natural parameters including “slope gradient,” “slope curvature,” “geology,” and “closeness to fault,” human activities have played a major role in rock-falls in the study area. The susceptibility map produced here is considered acceptable as a basis for studies on mass movement risk management in the study area. If the susceptibility map were overlaid with a vulnerability map (an inventory of buildings, infrastructure, and other elements at risk and of their expected losses), an objective risk assessment could be achieved. The information derived from this map can help citizens, planners, and engineers to reduce losses caused by existing and future rock-falls by means of prevention and mitigation.

## Acknowledgments

We wish to thank Prof. Yamagishi Hiromitsu, Ehime University, Japan, and Prof. Kam Tim Chau, the Hong Kong Polytechnic University of China, for their advice and suggestions on an earlier version of this manuscript. The authors also wish to thank the Transport Office of the Kurdistan province for the report of rock-fall locations in the study area, and the University of Kurdistan and the University of Mazandaran for their financial support.

### Open Access

This article is distributed under the terms of the Creative Commons Attribution License which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.