Introduction

Landslides are major natural geological hazards and, each year, are responsible for enormous property damage and both direct and indirect costs. Malaysia experiences frequent landslides, with the most recent occurring in 2000, 2001, 2004, 2007, and 2008. They often result in significant harm to people and damage to property. On Penang Island, torrential rainfall triggered landslides and mudslides that caused much damage, and, because there was little effort to assess or predict these events, the damage was extensive. Through scientific analysis of landslides, we can assess and predict landslide-hazardous areas and, by allowing proper preparation, decrease landslide damage. To achieve this, landslide-hazard analysis techniques were cross-applied and verified using a fuzzy membership model.

Many studies have been carried out on landslide-hazard evaluation using GIS; for example, Guzzetti et al. (1999) summarized many landslide-hazard evaluation studies. Many of the recent studies have applied probabilistic models (Akgun et al. 2007; Dahal et al. 2007; Clerici et al. 2006; Cevik and Topal 2003; Rowbotham and Dudycha 1998; Jibson et al. 2000; Luzi et al. 2000; Parise and Jibson 2000; Baeza and Corominas 2001; Clerici et al. 2002; Donati and Turrini 2002; Zhou et al. 2002; Lee et al. 2004b). One of the statistical models available, logistic regression, has also been applied to landslide-hazard mapping (Tunusluoglu et al. 2007; Lamelas et al. 2008; Wang and Sassa 2005; Süzen and Doyuran 2004; Dai and Lee 2002; Ohlmacher and Davis 2003; Mansor et al. 2007; Pradhan et al. 2006, 2008), as have the geotechnical model and the safety factor model (Gokceoglu et al. 2000; Romeo 2000; Refice and Capolongo 2002; Carro et al. 2003; Shou and Wang 2003; Zhou et al. 2003; Lee 2007; Lee and Pradhan 2007). As new approaches to landslide-hazard evaluation using GIS, data mining, safety factor, fuzzy logic and artificial neural network models have been applied (Gomez and Kavzoglu 2005; Xie et al. 2004; Ercanoglu and Gokceoglu 2002; Pistocchi et al. 2002; Lee et al. 2003a, b, 2004a; Tangestani 2004; Lee and Pradhan 2006; Pradhan and Lee 2007, 2008; Catani et al. 2005).

Study area

Penang Island, which has suffered much landslide damage following heavy rains, was selected as a suitable pilot area to evaluate landslide-hazard analysis using the fuzzy operator model (Fig. 1). Penang is one of the 13 states of Malaysia and is located on the northwest coast of Peninsular Malaysia. It is bounded to the north and east by the state of Kedah, to the south by the state of Perak, and to the west by the Straits of Malacca and Sumatra (Indonesia). Penang consists of the island of Penang and a coastal strip on the mainland, known as Province Wellesley. The island covers an area of 285 km² and is separated from the mainland by a channel. The study area is located approximately between latitudes 5°15′N and 5°30′N and longitudes 100°10′E and 100°20′E. The land use in the study area is mainly peat swamp forest, plantation forest, inland forest, scrub, grassland, and ex-mining area. The slope of the area ranges from 25° to as much as 87°. The relief of the study area varies from 0 to 420 m above mean sea level. According to the Malaysian Meteorological Department, the temperature of the northern part of Penang ranges between 29°C and 32°C, and mean relative humidity varies between 65% and 70%. The highest temperatures occur from April to June, while the relative humidity is lowest in June, July and September. Rainfall of about 58.6 to 240 mm per month is recorded in the study area (at the Bayan Lepas weather station; data provided by the Malaysian Meteorological Services Department). GPS data for landslide locations were collected in various parts of Penang Island in July and October 2006. About 21 active landslides were recorded in the field survey, and these were used to verify the model output.

Fig. 1 The study area with landslide locations

Spatial database creation

To apply the fuzzy approach model, a spatial database that considers landslide-related factors was designed and constructed. These data are available in Malaysia either as paper or as digital maps. The spatial database constructed is listed in Table 1. Landslide occurrence areas were detected in the Penang area, Malaysia, by interpretation of aerial photographs and field surveys. A landslide map was prepared from aerial photographs, in combination with the GIS, and this was used to evaluate the frequency and distribution of shallow landslides in the area. These landslides were detected from aerial photographs by interpreting breaks in the forest canopy, bare soil, and other typical geomorphic characteristics of landslide scars. A total of 463 landslides were mapped within 285 km² to assemble a database to assess the number of landslides in the study area. The inventory map was prepared by a structural geologist with extensive experience in aerial photo interpretation. These landslides were vectorized from the aerial photographs, and the location of the scar extent was transformed into a grid database with a 10 × 10 m cell size. Topography and lithology databases were constructed, and lineament, land cover and vegetation index values were extracted from Landsat TM (Thematic Mapper) satellite images for the analysis. Then, the calculated and extracted factors were converted to a 10 × 10 m grid (ARC/INFO GRID type). Ten landslide-inducing factors were considered in calculating the fuzzy membership function. These factors were transformed into a vector-type spatial database using the GIS. For the DEM creation, 10-m interval contours and survey base points showing the elevation values were extracted from the 1:25,000-scale topographic maps, and a triangulated irregular network (TIN) was built from the elevation values. 
A DEM (digital elevation model) was made using the inverse distance weighting (IDW) interpolation method with 10 m resolution. Using this DEM, slope angle, slope aspect, and slope curvature were calculated. In the present study, substantial attention has been given to slope conditions. Slope configuration and steepness play an important role in conjunction with lithology. Maps have been produced showing slope steepness; the slope ranges from 0 to 45° in the plain areas to more than 45°, including near-vertical cliffs, in the steep areas. In the aspect map, eight classes represent the different slope directions. In the curvature map, negative curvature represents concave, zero curvature represents flat, and positive curvature represents convex surfaces. The curvature map was prepared using an Avenue routine in ArcView 3.2. In addition, the distance from drainage was calculated using the topographic database. The drainage buffer was calculated at 100-m intervals and classified into ten equal-area classes. The lithology map was prepared from a 1:63,300-scale geological map. A structural geologist detected the lineaments by photo interpretation of the SPOT 5 image, and the distance from lineament was calculated using the Euclidean distance method in ArcView 3.2. The lineament buffer was calculated at 100-m intervals and classified into ten equal-area classes. The land-use map was prepared from a Landsat TM image (30 m spatial resolution) using the unsupervised classification (ISODATA) method and a field survey. Eleven land-use classes were identified, such as urban, water, forest, agriculture, and barren area. Finally, the normalized difference vegetation index (NDVI) map was generated from SPOT 5 (2.5 m spatial resolution) satellite images. 
The NDVI value was calculated using the formula \( \text{NDVI} = (\text{IR} - R) / (\text{IR} + R) \), where IR is the energy reflected in the infrared portion of the electromagnetic spectrum, and R is the energy reflected in the red portion of the electromagnetic spectrum. The NDVI is useful in delineating vegetation. The soil map was obtained from an existing hard-copy soil map (1:100,000) from the Department of Soil, Malaysia, which is the only existing soil map for the study area. About 14 different soil series were identified and used in the analysis. The precipitation map was prepared using the last 20 years of historical rainfall data. There are only two rain-gauge stations in the study area, so the data used in this study were collected from these stations, and a statistical distribution of the accumulated average precipitation was prepared in GIS.
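As a minimal sketch of the NDVI formula above, the following Python snippet computes the index cell by cell with NumPy. The function name `ndvi`, the small 2 × 2 arrays, and the choice to mark cells with IR + R = 0 as NaN are illustrative assumptions and are not taken from the study's processing chain.

```python
import numpy as np

def ndvi(ir, red):
    """Return (IR - R) / (IR + R) per cell; cells where IR + R == 0 become NaN."""
    ir = np.asarray(ir, dtype=float)
    red = np.asarray(red, dtype=float)
    total = ir + red
    out = np.full(total.shape, np.nan)
    # Divide only where the denominator is non-zero (assumed handling of empty cells).
    np.divide(ir - red, total, out=out, where=total != 0)
    return out

# Hypothetical reflectance values, for illustration only.
ir_band = np.array([[0.50, 0.30], [0.20, 0.00]])
red_band = np.array([[0.10, 0.30], [0.40, 0.00]])
ndvi_map = ndvi(ir_band, red_band)
```

Values close to 1 indicate dense vegetation, while values near or below 0 indicate bare or non-vegetated surfaces.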

Table 1 Thematic data layer of study area

Using the factors and the detected landslides, the relationships were calculated using the frequency ratio, one of the probabilistic models. Then, the fuzzy membership values were also calculated using the frequency ratio. The fuzzy membership values were combined using the fuzzy and, fuzzy or, fuzzy algebraic product, fuzzy algebraic sum, and fuzzy gamma operators (13 cases) for landslide-hazard mapping. Finally, the map was verified by comparison with existing landslide locations to calculate the prediction accuracy.

The fuzzy operator model was applied using the database, and a landslide-hazard map was created. The map was then verified and compared against known landslide locations for quantitative verification. In the study, the GIS software packages ArcView 3.2 and ArcGIS 9.0 were used as the basic analysis tools for spatial management and data manipulation. Overall, the Penang data set comprised 2,493 rows by 1,887 columns, for a total of 4,704,291 cells. Landslides had occurred in 463 of these cells.

Methodology

The fuzzy set theory introduced by Zadeh (1965) is one of the tools used to handle complex problems, and it has therefore been commonly used in scientific studies across different disciplines. The idea of fuzzy logic is to consider the spatial objects on a map as members of a set. In classical set theory, an object is a member of a set if it has a membership value of 1, and is not a member if it has a membership value of 0. In fuzzy set theory, membership can take on any value between 0 and 1, reflecting the degree of certainty of membership. Fuzzy set theory employs the idea of a membership function that expresses the degree of membership with respect to some attribute of interest.

With maps, generally, the attribute of interest is measured over discrete intervals, and the membership function can be expressed as a table relating map classes to membership values. Fuzzy logic is attractive because it is straightforward to understand and implement. It can be used with data from any measurement scale, and the weighting of evidence is controlled entirely by the expert. The fuzzy logic method allows for more flexible combinations of weighted maps and can be readily implemented with a GIS modeling language. This is different from data-driven approaches such as weights of evidence or logistic regression, which use the locations of known objects such as landslides to estimate weights or coefficients. In landslide-hazard mapping, for example, the spatial objects could be areas on an evidence map and the set could be defined as ‘areas hazardous to landslide’. Fuzzy membership values must lie in the range [0, 1], but there are no other practical constraints on the choice of the fuzzy membership values. Values are chosen to reflect the degree of membership of a set, based on subjective judgment. Given two or more maps with fuzzy membership functions for the same set, a variety of operators can be employed to combine the membership values.
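A membership function expressed as a table can be applied to a classified map by simple lookup. The sketch below illustrates this with NumPy; the class codes, the membership values in `membership_table`, and the small `class_map` raster are hypothetical and serve only to show the mechanism.

```python
import numpy as np

# Hypothetical membership table: index = map class code, value = fuzzy membership.
membership_table = np.array([0.10, 0.35, 0.80, 1.00])

# Hypothetical classified raster; each cell holds a class code from 0 to 3.
class_map = np.array([[0, 1, 1],
                      [2, 3, 2]])

# Integer-array indexing replaces every class code by its membership value.
membership_map = membership_table[class_map]
```

The result has the same shape as the classified map, so several such membership maps can be stacked and combined cell by cell.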

Zimmerman (1996) discussed a variety of combination rules. Bonham-Carter (1994) discussed five operators, namely the fuzzy and, fuzzy or, fuzzy algebraic product, fuzzy algebraic sum, and fuzzy gamma operator. This study uses the five fuzzy operators for combining the fuzzy membership functions.

The fuzzy and is equivalent to a Boolean AND (logical intersection) operation on classical set values of 0 and 1. It is defined as:

$$ \mu_{\text{combination}} = \text{MIN}\left( \mu_{\text{A}}, \mu_{\text{B}}, \mu_{\text{C}}, \ldots \right), $$
(1)

where \( \mu_{\text{combination}} \) is the calculated fuzzy membership value, \( \mu_{\text{A}} \) is the membership value for map A at a particular location, \( \mu_{\text{B}} \) is the value for map B, and so on.

The fuzzy or is like the Boolean OR (logical union) in that the output membership values are controlled by the maximum values of any of the input maps. The fuzzy or is defined as:

$$ \mu_{\text{combination}} = \text{MAX}\left( \mu_{\text{A}}, \mu_{\text{B}}, \mu_{\text{C}}, \ldots \right). $$
(2)

The fuzzy algebraic product is defined as:

$$ \mu_{\text{combination}} = \prod\limits_{i = 1}^{n} \mu_{i}, $$
(3)

where \( \mu_{i} \) is the fuzzy membership function for the i-th map, and i = 1, 2, …, n maps are to be combined.

The fuzzy algebraic sum is complementary to the fuzzy algebraic product, being defined as:

$$ \mu _{{{\text{combination}}}} = 1 - {\prod\limits_{i = 1}^n {{\left( {1 - \mu _{i} } \right)}} }. $$
(4)

The gamma operation is defined in terms of the fuzzy algebraic product and the fuzzy algebraic sum by:

$$ \mu_{\text{combination}} = \left( \text{Fuzzy algebraic sum} \right)^{\lambda} \times \left( \text{Fuzzy algebraic product} \right)^{1 - \lambda}, $$
(5)

where λ is a parameter chosen in the range [0, 1], and the fuzzy algebraic sum and fuzzy algebraic product are calculated using Eqs. 4 and 3, respectively. In the fuzzy gamma operation, when λ is 1, the combination is the same as the fuzzy algebraic sum, and when λ is 0, the combination equals the fuzzy algebraic product. Judicious choice of λ produces output values that ensure a flexible compromise between the ‘increase’ tendencies of the fuzzy algebraic sum and the ‘decrease’ effects of the fuzzy algebraic product.
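The five operators of Eqs. 1 to 5 can be sketched in a few lines of NumPy. In this illustration the membership maps are assumed to be stacked along the first axis of an array `mu` (one layer per factor); the function names and the sample values are hypothetical and do not reproduce the GIS implementation used in the study.

```python
import numpy as np

def fuzzy_and(mu):
    """Eq. 1: cell-wise minimum over all membership maps."""
    return np.min(mu, axis=0)

def fuzzy_or(mu):
    """Eq. 2: cell-wise maximum over all membership maps."""
    return np.max(mu, axis=0)

def fuzzy_product(mu):
    """Eq. 3: cell-wise product of the membership values."""
    return np.prod(mu, axis=0)

def fuzzy_sum(mu):
    """Eq. 4: one minus the product of the complements."""
    return 1.0 - np.prod(1.0 - mu, axis=0)

def fuzzy_gamma(mu, lam):
    """Eq. 5: (algebraic sum)^lam * (algebraic product)^(1 - lam)."""
    return fuzzy_sum(mu) ** lam * fuzzy_product(mu) ** (1.0 - lam)

# Hypothetical stack: three factor maps (rows) for two cells (columns).
mu = np.array([[0.2, 0.9],
               [0.5, 0.6],
               [0.8, 0.7]])

and_map = fuzzy_and(mu)
or_map = fuzzy_or(mu)
product_map = fuzzy_product(mu)
sum_map = fuzzy_sum(mu)
gamma_map = fuzzy_gamma(mu, 0.8)
```

For any cell, the algebraic product is never larger than the smallest input and the algebraic sum is never smaller than the largest input, so the gamma result always lies between these two extremes.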

The fuzzy membership values were derived from the frequency ratio, which is shown in Table 2 for all factors. The spatial relationships between the landslide locations and each landslide-related factor were analyzed using the probability model known as the frequency ratio. The frequency ratio, a ratio between the occurrence and absence of landslides in each cell, was calculated for each factor’s type or range that had been identified as significant with respect to causing landslides. First, the landslide occurrence ratio for each factor’s type or range was calculated. Next, an area ratio for each factor’s type or range to the total area was calculated. Finally, frequency ratios for each factor’s type or range were calculated by dividing the landslide occurrence ratio by the area ratio. If the ratio is greater than 1, the relationship between landslides and the factor’s type or range is stronger; if the ratio is less than 1, the relationship is weaker. Then, the frequency ratio was normalized between 0.00 and 1.00 to create the fuzzy membership value.
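The frequency-ratio calculation described above can be sketched as follows. The text does not state which normalization was used, so the min-max scaling in `normalise` is an assumption; the function names and the tiny example grid are likewise hypothetical.

```python
import numpy as np

def frequency_ratio(class_map, landslide_mask, n_classes):
    """Landslide occurrence ratio divided by area ratio, per class of one factor."""
    class_cells = np.bincount(class_map.ravel(), minlength=n_classes)
    slide_cells = np.bincount(class_map[landslide_mask], minlength=n_classes)
    area_ratio = class_cells / class_cells.sum()
    slide_ratio = slide_cells / slide_cells.sum()
    fr = np.zeros(n_classes)
    # Classes that do not occur in the map keep a ratio of 0.
    np.divide(slide_ratio, area_ratio, out=fr, where=area_ratio > 0)
    return fr

def normalise(fr):
    """Assumed min-max scaling of the frequency ratio to the range 0..1."""
    span = fr.max() - fr.min()
    return (fr - fr.min()) / span if span > 0 else np.zeros_like(fr)

# Hypothetical factor map with three classes and two landslide cells.
class_map = np.array([[0, 0, 1, 1],
                      [0, 0, 1, 2]])
landslide_mask = np.array([[False, False, True, False],
                           [False, False, False, True]])

fr = frequency_ratio(class_map, landslide_mask, n_classes=3)
membership = normalise(fr)
```

In this toy grid, class 2 covers one eighth of the area but holds half of the landslide cells, so its frequency ratio is 4 and it receives the highest membership value.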

Table 2 Spatial relationships between each factor and landslide and fuzzy membership values

Application of fuzzy logic to landslide-hazard mapping

The input factors were combined for assigning membership functions. Nine landslide causal factors (slope, aspect, curvature, distance from drainage, soil, distance from lineament, NDVI, precipitation, and land cover) were combined to generate the final hazard map using fuzzy operators such as fuzzy and, fuzzy or, fuzzy algebraic product, fuzzy algebraic sum, and fuzzy gamma operator. In the case of fuzzy gamma operator, the value of λ was set to 0.025, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.95, and 0.975 to detect its effect on the landslide-hazard map.

Using the fuzzy membership function (Table 2) and the fuzzy operators (Eqs. 1 to 5), the landslide-hazard index (LHI) values were computed for the 17 cases, including the 13 cases in which the gamma operator was used. The computed LHI values were mapped to allow interpretation, as illustrated in Fig. 2. The values were classified into equal areas and grouped into five classes for visual interpretation. For example, in the case of applying the fuzzy and product, the minimum, mean, maximum and standard deviation values of each LHI are 0.00, 0112, 0.3281, and 0.0254, respectively. In the case of applying the fuzzy algebraic sum, the minimum, mean, maximum, and standard deviation values of each LHI are 0.0042, 0.0277, 0.0513, and 0.0078, respectively. In the case of applying the gamma operator (λ = 0.975), the minimum, mean, maximum, and standard deviation values of each LHI are 0.000, 0.3003, 1.7448, and 0.5046, respectively. Also, in the case of applying the gamma operator (λ = 0.8), the minimum, mean, maximum, and standard deviation values of each LHI are 0.000, 0.0061, 0.00092, and 0.00117, respectively.
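The summary statistics and the equal-area grouping into five classes can be illustrated with a rank-based classification. This is a sketch under assumptions: the random `lhi` grid stands in for a real LHI map, and breaking ties by cell order is an arbitrary choice that the text does not specify.

```python
import numpy as np

def equal_area_classes(lhi, n_classes=5):
    """Assign classes 1..n_classes so that each class covers about the same area."""
    flat = lhi.ravel()
    # Rank of every cell from lowest (0) to highest LHI; ties follow cell order.
    ranks = np.argsort(np.argsort(flat, kind="stable"), kind="stable")
    classes = ranks * n_classes // flat.size + 1
    return classes.reshape(lhi.shape)

rng = np.random.default_rng(0)
lhi = rng.random((20, 25))  # hypothetical LHI grid with 500 cells

classes = equal_area_classes(lhi)
summary = {
    "min": float(lhi.min()),
    "mean": float(lhi.mean()),
    "max": float(lhi.max()),
    "std": float(lhi.std()),
}
```

Class 5 then holds the 20% of cells with the highest index values, which is the kind of grouping used for visual comparison of the hazard maps.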

Fig. 2 Landslide susceptibility maps using various fuzzy operators. a Application of fuzzy algebraic “sum” operator. b Application of fuzzy algebraic “and” operator. c Application of fuzzy gamma (λ = 0.8) operator. d Application of fuzzy gamma (λ = 0.9) operator. e Application of fuzzy gamma (λ = 0.975) operator

Verification of the fuzzy logic model and hazard maps

The landslide-hazard analysis results were verified using the existing landslide locations in the study area. The verification was performed by comparing the existing landslide data with newly recorded landslide location data collected in the field in the study area. The comparison results are shown in Fig. 3 as a line graph, which illustrates how well the landslide-hazard maps of nine cases match the landslides used in constructing those landslide-hazard maps. To obtain the data for Fig. 3, the relative ranks of the landslide-hazard map and the landslide occurrences were compared for each case. To this end, the probabilities were divided into classes of accumulated area ratio % (X-axis) according to the landslide-hazard index value (Y-axis).

Fig. 3 Illustration of cumulative frequency diagram showing landslide susceptibility index rank (X-axis) occurring in cumulative percent of landslide occurrence (Y-axis)

For example, when applying the fuzzy algebraic sum, the 90–100% (10%) class with the highest probability of a landslide contains 54% and the 80–100% (20%) class contains 45% of the landslides in the study area. In the case of applying the fuzzy and product, the 90–100% (10%) class with the highest probability of a landslide contains 45% and the 80–100% (20%) class contains 64% of the landslides in the study area. In the case of the gamma operator (λ = 0.975), the 90–100% (10%) class with the highest possibility of a landslide contains 58% and the 80–100% (20%) class contains 68% of the landslides in the study area. In the case of applying the gamma operator (λ = 0.8), the 90–100% (10%) class with the highest possibility of a landslide contains 53% and the 80–100% (20%) class contains 65% of the landslides in the study area.

To compare the results quantitatively, the areas under the curves were recalculated taking the total area as 1, which corresponds to perfect prediction accuracy. The area under a curve can therefore be used to assess the prediction accuracy quantitatively for landslide-hazard mapping. The areas under the curves are shown in Table 3. For example, in the case of applying the fuzzy algebraic and, the area ratio was 0.7949, corresponding to a prediction accuracy of 79.49%. In the case of applying the fuzzy algebraic sum, the area ratio was 0.6477, corresponding to a prediction accuracy of 64.77%. In the case of applying the gamma operator (λ = 0.975), the area ratio was 0.7914, corresponding to a prediction accuracy of 79.14%.
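The area-under-curve calculation can be sketched as follows: cells are ranked from the highest to the lowest hazard index, the cumulative share of landslide cells is plotted against the cumulative share of area, and the area under that curve is integrated with the trapezoidal rule. The function name, the four-cell example, and the cell-by-cell (rather than class-by-class) accumulation are assumptions made for illustration.

```python
import numpy as np

def success_rate_auc(lhi, landslide_mask):
    """Area under the cumulative landslide (%) vs. cumulative area (%) curve."""
    # Rank cells from highest to lowest hazard index.
    order = np.argsort(-lhi.ravel(), kind="stable")
    hits = landslide_mask.ravel()[order].astype(float)
    # Cumulative fractions, both starting at the origin; total area is taken as 1.
    x = np.concatenate(([0.0], np.arange(1, hits.size + 1) / hits.size))
    y = np.concatenate(([0.0], np.cumsum(hits) / hits.sum()))
    # Trapezoidal rule.
    return float(np.sum(np.diff(x) * (y[1:] + y[:-1]) / 2.0))

# Hypothetical example: four cells, two of which contain landslides.
lhi = np.array([0.9, 0.8, 0.3, 0.1])
observed = np.array([True, False, True, False])
auc = success_rate_auc(lhi, observed)

# A ranking that places both landslide cells first scores higher.
better = success_rate_auc(lhi, np.array([True, True, False, False]))
```

With only a handful of cells the maximum attainable value stays below 1; on a grid of several million cells, as in the study, the curve for a good map approaches the upper-left corner and the area approaches 1.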

Table 3 Verification results using area under curve (AUC)

Discussion and conclusions

Different fuzzy operators and different λ values for the gamma operation were tested on the input fuzzy membership functions to generate the most reliable landslide-hazard map. The membership values assigned to each evidence map also play an important role in the final results. The fuzzy operators used in the first or further steps of analyses also affect the possibilities obtained in the final hazard map.

After verification, among the 17 cases, the gamma operator with λ = 0.8 showed the best accuracy (80.26%), whereas the fuzzy algebraic sum (64.77%) and fuzzy or (56.86%) operators showed the worst accuracy. For the gamma operator with different λ values, the prediction accuracy was similar, between 79.14% and 80.26%. Generally, the verification results showed satisfactory agreement between the hazard map and the existing data on landslide locations. The effect of choosing different values of gamma (between 0 and 1) is not large: the landslide-hazard maps produced using different values of gamma (Fig. 2) are very similar, and the prediction accuracy after verification is also very similar.

In the study, the data-derived model (frequency ratio) and the knowledge-derived model (fuzzy logic) were combined. Judging by the prediction accuracy, the combined data- and knowledge-derived model is useful for landslide-hazard mapping. Decision-making under uncertainty is closely related to hazard analysis, and a landslide-hazard map will help planners in their decision-making. These decisions usually take the form of technical countermeasures, regulatory management, or combinations of the two. Classic examples of regulatory management are zoning maps which, for instance, exclude some areas from habitation. Regulatory management is often quite intricate in prescribing different permit procedures, which may include detailed evaluations and additional exploration or even go so far as to prescribe particular slope designs (e.g., slope grades). The latter is actually a combination of regulatory and technical management. Technical mitigating measures range from a variety of stabilizing measures to protective measures such as rock-fall galleries and warning devices. One of the most important steps in developing a hazard-mitigation plan is assessing risks, or estimating potential losses to the people and properties within the landslide-prone area.