Introduction

Landslide inventories provide a fundamental data source for landslide susceptibility, hazard, and risk assessment (Petschko et al. 2016). Landslide inventory maps are produced by conventional (consolidated) and innovative (remote sensing) methods (Guzzetti et al. 2012). Broad overviews concerning landslide investigations can be found in Guzzetti et al. (2012) and Scaioni et al. (2014). Among traditional methods, visual interpretations of stereoscopic aerial photography and geomorphological field reconnaissance are widely used. However, conventional map-generating techniques require expert knowledge, are highly subjective, and have limited reproducibility (Dou et al. 2015). In contrast, semi-automated or automated approaches exploiting remote sensing (RS) data can overcome such issues (Dou et al. 2015).

Today, research in the landslide recognition field is mainly focused on three areas: (1) deep exploration of high-resolution RS data to reveal landslide-related information, (2) the automation of landslide identification and feature extraction from RS data, and (3) the integration of diverse RS data. Recently, considerable efforts have been exerted in research on automatic landslide mapping (ALM) to exclude subjectivism, reduce time, and increase efficiency in landslide detection. This task is challenging but also complex (Scaioni et al. 2014). In the last few years, several studies have investigated this issue to develop more or less automated remote sensing techniques for ALM (Stumpf et al. 2017). Automatic techniques include analysis of RS data, such as optical images (Chen et al. 2017; Dou et al. 2015; Kurtz et al. 2014), synthetic aperture radar (SAR) data (Del Ventisette et al. 2014; Wasowski and Bovenga 2014), and light detection and ranging (LiDAR) digital terrain models (DTMs) (Leshchinsky et al. 2015; Lin et al. 2013b; Tarolli et al. 2012; Van Den Eeckhaut et al. 2012). The availability of high-resolution (HS) optical images (spaceborne, airborne, and terrestrial) affords more accurate and efficient landslide mapping than ever before (Li et al. 2016). Studies on the analysis of spectral data are mainly focused on mapping landslides resulting in the disappearance of vegetation cover (Stumpf et al. 2017) or seasonal and long-term changes in land use (Guzzetti et al. 2012). Unfortunately, such methods are ineffective in the case of slow-moving landslides in vegetated and agricultural areas (Van Den Eeckhaut et al. 2012) and also in cloudy areas (Li et al. 2016). In contrast, interferometric synthetic aperture radar (InSAR) constitutes the basis for a suitable technique to detect and monitor slow-moving landslides (Tofani et al. 2013). This technique is cloud independent but is limited to areas without dense vegetation cover, with favourable slope exposition and with substantial movements along the line of sight of the satellite (Wasowski and Bovenga 2014). Another promising technology is LiDAR owing to its multiple-echo or full-waveform capability, which allows easy filtering of vegetation and other non-ground objects, as well as the provision of detailed topography (Tarolli 2014). Hence, LiDAR is extremely helpful in landslide detection in areas covered by vegetation—even dense—where other techniques cannot be applied (Lin et al. 2013a). Jaboyedoff et al. (2012) provide a comprehensive overview of laser scanning applications in landslide investigations.

Looking at the ALM approaches based on RS data, we can discern two main groups: pixel-based (PBA) and object-oriented approaches (OOA). Generally, PBA relies solely on the spectral characteristics of the analysed image. This makes PBA more user-friendly. However, the main limitation of PBA concerns the speckled appearance of the classification results (Keyport et al. 2018) and the amalgamation of landslides (Marc and Hovius 2015). OOA integrates segments into meaningful objects using user-defined rules (Feizizadeh et al. 2017). These rules can be defined by statistics, shapes, texture, and contextual information (mutual relationships of image objects), which can often not be applied in PBA (Li et al. 2016). However, OOA is more site-specific and contains too many classification steps to be transferred easily to other regions (Van Den Eeckhaut et al. 2012).

In both approaches, various widely used DTM derivatives have been explored for landslide detection. In some studies, certain morphometric parameters have been used exclusively to detect landslides (Chen et al. 2014; Lin et al. 2013b; Van Den Eeckhaut et al. 2012) and in other studies, they have sometimes been integrated with other RS data (Chen et al. 2017; Kurtz et al. 2014; Mezaal et al. 2017). McKean and Roering (2004) applied local surface roughness, calculating variability in slope and aspect and the two-dimensional topographic curvature through the Laplacian operator to detect and characterize landslides. Also, Glenn et al. (2006) used local surface roughness delivered by LiDAR-DTM to differentiate diverse morphological components within isolated landslides. The two studies present differences in surface roughness throughout the diverse parts of a landslide. Van Den Eeckhaut et al. (2012) applied slope, plan curvature, roughness, openness, and multiple flow direction to detect a landslide using only DTM and OOA. Tarolli et al. (2012) used landform curvature for automatic detection of landslide crowns. Lin et al. (2013b) detected large-scale landslides using a shading map with aspects of different azimuth and aerial photos in the Namasha-Liuoguey area in Taiwan. Kurtz et al. (2014) used slope and curvature together with multiresolution images and a top-down hierarchical framework to extract landslides. Chen et al. (2014) used PBA, applying random forest and DTM-delivered layers such as mean aspect, DTM and slope textures, based on four textural directions. Shortly afterwards, Li et al. (2015) applied DTM, slope, aspect and surface roughness, and their textures and filtered modifications to identify forested landslides using machine learning and OOA. Still more recently, Chen et al. (2017) applied curvature, hillshade, roughness, flow direction, and slope simultaneously with satellite images and OOA to map a landslide in the Three Gorges Reservoir in China. Mezaal et al. (2017) applied 39 features in calculating the grey-level co-occurrence matrix (GLCM) from hillshade, height (normalized digital surface model nDSM), slope, and aspect to optimize a neural network architecture for ALM.

An important aspect in the effectiveness of ALM is the selection of features. Recent papers have demonstrated that the combination of many DTM derivatives with sophisticated machine learning classifiers (support vector machine or random forest) is computationally time-consuming (Chen et al. 2014; Mezaal et al. 2017) and can cause over-fitting issues (Chen et al. 2014). The best classification results have been achieved by selecting the most relevant features (Danneels et al. 2007; Kursa and Rudnicki 2010; Li et al. 2015; Mezaal et al. 2017; Pawłuszek and Borkowski 2017a; Stumpf and Kerle 2011).

Another very important aspect affecting the performance of ALM is scale. Scale is considered an effect of DTM resolution and the size of the window-moving calculation (also called the kernel size) of these topographical parameters. The selection of the appropriate scale is necessary to achieve high performance in landslide mapping (Paudel et al. 2016). Using a coarser DTM resolution, the topographic representation can be too smooth to detect a landslide or its features. Keijsers et al. (2011) concluded that landslide prediction is better with fine resolution for DTMs because this is free of the smoothing effect. In contrast, Tarolli and Tarboton (2006) found that the optimal DTM resolution for the detection of most likely landslide initiation points (in the case of shallow landslides) is 10 m. However, Tarolli et al. (2012) tested the effectiveness of different landform curvature maps with different smoothing factors for feature extraction. They analysed different kernel sizes to calculate curvature from 0.5 m DTM and found that curvature calculation is strongly scale dependent, and a 21 × 21 kernel size (10.5 m wide) was the most suitable scale for the extraction of landslide crowns and bank erosion in their specific study area.

To summarize, the accuracy of ALM is affected by several aspects: the selection of appropriate morphological indicators and scale (DTM resolution and kernel size for the calculation of DTM derivatives). Also, the classification method is critical. Although different scholars have discussed such issues, there is still a lack of systematic studies on the aspects influencing sensitivity in the context of ALM accuracy.

The main objective of this study is to perform a detailed sensitivity analysis of ALM, bringing together all the most important surface derivatives and morphometric parameters, the concept of scale (cell size, but also kernel size), and classification methods. This broad analysis, which contained almost 500 experimental tests, represents the challenging aspect of our research. In detail, 17 topographic parameters were calculated and combined, examining the optimal scale (DTM resolution and kernel size) for ALM. The performance parameters were then analysed through the feed-forward neural network (FFNN) and support vector machine (SVM) classification methods.

General settings of the study area

This study focused on a selected region of the Flysch Carpathian Mountains in Poland, known for its frequent occurrence of landslides. As part of the Landslide Counteracting Framework (SOPO) project (Bąk et al. 2011; Borkowski et al. 2011), approximately 60,000 landslides have been inventoried in the Polish part of the Carpathian mountains, 13% of which contained infrastructural elements in their area, such as buildings, roads, or transmission lines (Marciniec and Zimnal 2015). The study area covers an area of 26.3 km2. It is located in southern Poland within a zone comprising 49° 43′ to 49° 46′ N latitude and 20° 38′ to 20° 43′ E longitude in the Łososina Dolna municipality (Fig. 1). Most of the municipality area comprises agricultural land (42%), while forests account for 34% of the total area (Kroh et al. 2014). The landslide activity located in the study area is mostly associated with hydrogeological conditions, controlled by the fluctuation of the water level in the Różnów Lake (Borkowski et al. 2011). Several landslides have been covered by forest and therefore their identification is very challenging. Hence, optical data are rendered useless and an inappropriate data source for ALM.

Fig. 1
figure 1

Location of the study area. Shaded relief map derived by 1 m DTM

Data

Landslide inventory map (SOPO system)

The Polish Geological Institute has created a “Landslide Counteracting System” called SOPO (Bąk et al. 2011; Borkowski et al. 2011). One of the aims of The SOPO project was to produce detailed evidence in the form of a digital database of all recently active and inactive landslides in Poland (Perski et al. 2010). Field investigations during the SOPO project show 372 landslides with a total area of 6.72 km2 within the study site. This means that the landslide areas cover more than 25% of the study area. The minimum, maximum, and average landslide sizes are 0.05, 30, and 1.7 ha, respectively. Within the study area, landslides present various activity states, such as active, periodically active (active over the past 50 years), and dormant. They usually occur on slopes above rivers and creeks. Following the updated Varnes classification (Hungr et al. 2014), the landslides investigated are of types 11, 12, and 14—on clay/silt rotational, planar, and compound slides (Kroh 2016). Fieldwork identifying and mapping landslides using the framework of the SOPO system were performed from October 2010 to June 2011 (Gorczyca and Wrońska-Wałach 2011).

LiDAR data

LiDAR data were gathered within the System of the Country’s Protection called ISOK. ISOK is the informatics system which provides comprehensive knowledge and information on water management in Poland (http://www.isok.gov.pl). Within this project, airborne laser scanning (ALS) was performed for almost the whole country. ALS data were captured within the ISOK project using the Rigel LiteMapper 6800i System based on the Q680i laser scanner. The average point density in the study area is 4–6 points/m2, and the estimated root mean squared error (RMSE) for the height component is about 0.15 m. Pawłuszek et al. (2014) assessed the accuracy of this point cloud, finding that it varies for different types of land use from 10 cm for roads to around 20 cm for forests. To separate points representing bare earth, TerrScan software was used. The filtering method implemented in TerraScan is based on a local, adaptive triangular irregular network model introduced by Axelsson (2000). Subsequently, LiDAR-based DTM was used to generate topographical layers to facilitate the identification of landslide areas.

Methodology

A general overview of the methodology applied is given in Fig. 2. The natural neighbour interpolation method (Sibson 1981) was used to produce a 1-m DTM, already proven to be an appropriate interpolator for geomorphologic analysis (Pirotti and Tarolli 2010). The DTM with 1 m resolution was used to generate other DTM layers and resolutions. Second, 20 different DTM derivatives were calculated for various DTM resolutions and kernel sizes.

Fig. 2
figure 2

Methodology flowchart

Third, based on DTM and specific DTM-delivered layers, double-layer MLC was performed. Stratified random sampling was employed, with 20% of total image pixels used for training and the remaining 80% used for accuracy assessment. Based on this, the relevance of each DTM-delivered layer was assessed. Moreover, based on the accuracy indices of an abundant number of classification tests, the sensitivity of specific landslide morphological layers for different DTM resolutions and different moving window sizes was also evaluated. Fig. 3 shows a graphical representation of the experimental tests performed for features sensitivity analysis as test cube.

Fig. 3
figure 3

Feature sensitivity assessment—experimental test cube. The colour intensity represents the performance (best performance corresponds to more intense colour) of DTM derivatives with reference to the kernel size and DTM resolution employed. The most intensive colour and black boundaries represent the best performance of the feature in reference to DTM resolution and kernel size

Eventually, the final automatic landslide mapping was performed by applying two strategies. The first strategy uses the most relevant features based on the kappa value (kappa-based feature selection). Feature layers with kappa values higher than 0.2 within each feature type (Appendix 1—features with bold red font) were used for final ALM (Appendix 2). This threshold was selected because it has been assumed that kappa values lower than 0.2 represent low agreement between reference data and classification data (Viera and Garrett 2005). The second strategy uses features calculated using the minimum window size (Appendix 3). These two strategies were executed to assess the impact of selected kernel size on the calculation of features when performing ALM with a composition of many different DTM-delivered layers. Moreover, within both strategies different maximum likelihood (ML), FFNN and SVM classification methods were applied to test performance in automatic landslide mapping using PBA. Figure 4 represents graphical representation of different experimental tests using different classifiers and some of their parameters.

Fig. 4
figure 4

Experimental cube. Graphical representation of classifier performance assessment. The colour intensity represents the performance (best performance corresponds to more intense colour) of classifiers with reference to DTM resolution and the parameters of certain classifiers. The most intensive colour and black boundaries represent the best performance of the classifier in reference to DTM resolution and some parameters (MLC—no parameters were tested)

Fine to coarse DTM generation

Pixel size is highly involved in the efficiency of mapping and its selection can be optimized, to a certain level, to satisfy both processing capabilities and the representation of spatial variability (Hengl 2006). It was concluded that no optimal pixel size exists, but rather a range of relevant resolutions. Although many papers have argued the influence of pixel size on the accuracy of specific modelling, the selection of pixel resolution is exceptionally based on the inherent spatial variability of the input data for any scientific justification, mostly drawing on information theory (Borkowski and Meier 1994; Hengl 2006). In contrast, some papers have demonstrated that the selection of the finest DTM resolution is not always the optimal choice (Pawłuszek et al. 2014; Mora et al. 2014; Penna et al. 2014; Tarolli and Tarboton 2006; Pawłuszek et al. 2017). The selection of an inappropriate spatial resolution for DTM may result in misjudgement of landslide identification or misinterpretation of landslide features or morphology (Mora et al. 2014). To examine, experimentally, the performance of automatic landslide mapping using PBA with reference to DTM resolution, DTM with grid cell sizes equal to 1, 2, 5, 10, 20, and 30 m were generated.

Generation of landslide morphological indicators (DTM derivatives)

All DTM-derived landslide topographic indicators were provided in raster format with the pixel size calculated according to the DTM resolution tested. The most popular and widely applied landslide topographic indicators were calculated independently for different resolutions of DTMs and kernel sizes. Some of these are presented in Fig. 5. Among these derivatives are the principal components of hillshades generated from eight different directions, linear aspects, flow direction, side exposure index, roughness index, curvature (Bolstad’s variant, minimum maximum and mean Evans’ variant), mean slope, standard deviation of the mean slope, topographic position index, openness, and standard deviation of aspect. Table 1 presents the main information, calculation patterns, and references of DTM derivatives used in this study.

Fig. 5
figure 5

DTM derivative examples. (a) TPI (b) openness (c) aspect (d) roughness (e) man slope (f) max curvature

Table 1 Description, calculation patterns, and references of DTM derivatives used

Feature selection methods

Kursa and Rudnicki (2010) reported that the selection of only the most appropriate features enhances the quality of landslide identification. The landslide topographic indicators were selected according to two strategies. Within the first strategy, all layers with a minimum kernel size were used. Within second strategy, all layers presenting the best kappa index were used (red bold font in Appendix 3). Kappa is an important index that measures agreement between classification or identification results and reference data. Appendix 1 presents feature performance in the case of ALM. The most intense colour of cubes (Fig. 3) represent the most relevant feature used for the final ALM. It allows to claim that results are not a product of guesswork (Viera and Garrett 2005). For the selection of the most relevant features, the assumption that kappa index has to be greater than 0.2 was used. This thresholds was selected according to the general kappa classification which announce that kappa lower than 0.2 represents slight agreement between classification results and ground truth data (Viera and Garrett 2005). Appendix 1 represents the accuracy parameters, which allow to diagnose the initial features and eliminate of the least important features.

Classification methods used

Many factors, including the spatial resolution of the data used, diverse data sources and data types, classification systems, and software availability, must be considered in selecting a classification method. Various classifiers have their own merits (Kotsiantis et al. 2007; Lu and Weng 2007). For instance, when sufficient training samples are available and the features are normally distributed, a parametric classifier such as ML may provide accurate results. However, when image data are anomalously distributed, neural networking may demonstrate better performance (Pal and Mather 2004). Many papers have demonstrated that non-parametric classifiers may provide better performance than parametric classifiers in complex landscapes.

It is very challenging to state definitively which classifier is the most appropriate for a specific study. Depending on the classifiers selected, various classification results may be acquired. Therefore, to provide a better assessment of the impact of each DTM resolution on ALM, three classification techniques were tested: ML, FFNN, and SVM. As previously mentioned, 20% of randomly selected points were used for training. A stratified random sampling approach was applied. The same training samples were used for all classification tests. Classification tests were performed for the six compositions created from the different landslide morphological indicators generated from six different DTM resolutions.

Maximum likelihood classification

ML is a supervised classification method which is determined by the Bayes theorem. It employs a discriminant function to assign pixels to user-defined classes with the maximum likelihood (Asmala 2012). ML may be the most widely applied parametric classifier in practice because of its robustness and the availability of software (Lu and Weng 2007). ML classification cannot be applied in the case of a unique band layer, and thus, all classification tests performed to assess the relevance of each landslide conditioning factor were undertaken for two layers: DTM and the topographic layer being analysed.

Feed-forward neural networking

Artificial neural network approaches have been widely adopted in recent years. Neural networks have several benefits, including their non-parametric nature, arbitrary decision boundary capability, easy adaptation to different data types and input structures, fuzzy output values, and generalization for use with multiple images (Lu and Weng 2007). However, the variation in the dimensionality of a dataset and the characteristics of training and testing sets may reduce the accuracy of image classification. After several trials, the following network architecture was selected: FFNN with one hidden layer and sigmoid transfer function.

Support vector machine

The SVM classification technique is also a non-parametric classifier, which is based on statistical learning theory, optimization algorithms and structural risk minimization theory; it has been effectively used in landslide susceptibility mapping (Pawłuszek and Borkowski 2017a; Pradhan 2013) and landslide identification (Li et al. 2015; Van Den Eeckhaut et al. 2012). According to Vapnik (1995), the main idea of SVM is to construct a hyperplane that separates the data set into discrete predefined classes based on created training samples (Mountrakis et al. 2011). The hyperplane refers to an optimal separation boundary plane to minimize misclassification. It is iteratively performed in the learning stage.

Accuracy assessment

The classification methods were implemented in ENVI 5.4 software. All results were validated using accuracy assessment. There are many methods in the literature for evaluating mapping accuracy, but thus far, there is no universal method (Congalton 1991; Dou et al. 2015). Nevertheless, studies that measure the agreement between two or more observers should include a statistic that takes into account the fact that observers will sometimes agree or disagree simply by chance (Viera and Garrett 2005). The kappa index is a statistic that is widely used for this objective and thus we adopt it as the most relevant index demonstrating the precision and robustness of classification. Therefore, the kappa (K) index, producer accuracy (PU), user accuracy (UA), and overall accuracy (OA) were calculated. The OA is determined as the sum of correctly classified pixels divided by the total pixel number. PU presents how many of the pixels in a specific class on the map are classified correctly. UA is calculated by total number of correctly classified pixels for a particular class and dividing it by the row total. The results obtained were assessed for accuracy based on the K, OA, and PU values (Appendices 1, 2, and 3)

Results and discussion

Performance of specific morphological indicators

To assess the significance of topographic features, separate ML classifications were performed for particular DTM derivatives. Appendix 1 presents the double-layer (DTM + specific DTM derivative) ML classification. Features with a kappa index greater than 0.2 are indicated in red and these layers were selected for final landslide identification, examples of the results of which are presented in Fig. 6 (ML classification) and Fig. 7 (SVM classification), with accuracy evaluations presented in Appendix 2. Based on Appendix 1, the most significant features seem to be the roughness index, openness, topographic position index, and curvature (Bolstad’s variant and minimum curvature). Flow direction presents a kappa index lower than 0.2, and thus, this layer was excluded from further landslide detection using the composition of topographic layers. For DTM resolutions of 20 and 30 m, the third principal component (PC) of hillshade, maximum curvature, and standard deviation of mean slope also exhibit low performance. Aspect and standard deviation of aspect indicate moderate performance in landslide detection. The kappa index for this layer of different DTM resolutions and maximum kernel sizes oscillates around 0.25. Figure 3 also represents the performance of DTM derivative layers. The intensity of colour of the cubes represents the performance of DTM derivatives with reference to the kernel size and DTM resolution employed.

Fig. 6
figure 6

Examples of automatic landslide mapping using MLC classification and a 1 m DTM, b 2 m DTM, c 5 m DTM, d 10 m DTM, e 20 m DTM, and f 30 m DTM

Fig. 7
figure 7

Examples of automatic landslide mapping using SVM classification and a 5 m DTM, b 10 m DTM, c 20 m DTM, and d 30 m DTM

Performance of DTM resolutions

To assess the sensitivity of topographic features with reference to DTM resolution, ML classification was performed for different DTM resolutions (Appendix 1). As can be observed, the performance of specific landslide indicators is higher if the DTM resolution is coarser. However, in almost all cases, the best feature performance is demonstrated for the 20 m resolution of DTM. As an exception, for the roughness index and openness, the best resolutions appear to be 30 and 10 m, respectively. The performance of maximum curvature is higher for a finer resolution of DTM. Based on the results obtained, it can be seen that some layers are more sensitive to the DTM resolution selected than others in the context considered in this study. For instance, the difference between minimum kappa and maximum kappa for mean curvature is around 0.04, while for the roughness index, it is 0.15. For many features, the difference in the kappa index can vary from 0.25 to almost 0.4, which shows substantial sensitivity to the DTM resolution selected. A kappa value of 0.25 can be observed for features calculated for the finest resolution of DTM. It can be concluded that the performance of the topographic indicators presented increases proportionally to the coarser resolution of DTM.

Performance of moving window sizes

To assess the sensitivity of topographic features with reference to kernel size, ML classification was performed for different kernel sizes (3 × 3, 5 × 5, 7 × 7, 9 × 9, 11 × 11, 13 × 13, 15 × 15) (Appendix 1). Some features demonstrate significant sensitivity to the kernel size; however, this is also linked to the DTM resolution used. In the case of almost all features, the sensitivity of the kernel size for the finest DTM resolution (1–2 m) is not significant. In many cases, the kappa index changes by around 0.03 and changes in OA do not exceed 4%. However, in the case of coarser DTM resolution, topographic layers are more sensitive to the kernel size used. For instance, observing the roughness index (rows 8 to 14 in Appendix 1), it can be seen that the performance of the roughness index increases proportionally to the coarser DTM resolution and to the finer kernel size. In the cases of 10, 20, and 30 m resolutions, the roughness index demonstrates the best performance for the minimum kernel size. Similar findings can be observed for mean curvature, mean slope, standard deviation of mean slope, and openness. However, the opposite is found for TPI and the standard deviation of aspect. These layers demonstrate better performance when the kernel size is bigger. In the case of curvature (Bolstadt’s variant), lower sensitivity to the kernel size used can be observed. Changes in the kappa index oscillate between 0.01 and 0.03 for the 30 m DTM resolution. This demonstrates that the selection of window size for such topographic features is not meaningful. The opposite can be observed for other features. In the case of coarser DTM resolution, the sensitivity of features to the kernel size increases, while in the case of finer resolution, it is also not significant.

Performance of feature selection methods

Appendices 2 and 3 present the accuracy assessment for ALM using various classification methods and DTM resolutions, respectively. The difference between these two strategies is the feature selection method. In Appendix 2, the results of ALM are presented for the topographic layer composition created from the most relevant topographic indicators, selected based on the kappa index (kappa-based feature selection method). In Appendix 3, the results of ALM are presented for topographic layer composition, for which features were calculated using minimum kernel size (minimum kernel size feature selection method). The accuracy assessment of these two strategies demonstrates that the classification results do not change significantly with reference to kernel size in relation to the topographic layers calculated. The kappa index does not change more than 0.05. However, this slight change was found in the case that a composition of 16 layers was used for classification. When only two layers are selected, the kernel size matters (DTM + roughness index), especially for coarser DTM resolutions (Appendix 1).

Performance of classification methods

Appendices 2 and 3 present also accuracy assessment for different classification methods and some of their parameters. ML classification, as an example of a parametric classifier, presents the lowest performance. In almost all cases, the best performance is presented by SVM classification; however, this classification method is time-consuming. Moreover, for the finest resolutions of DTM (1 and 2 m), this classification method cannot be executed, while FFNN and ML can. However, it can be observed that the accuracy increases proportionally to the coarser DTM resolutions. This could be caused by single pixels that are specific to finer DTM resolutions. FFNN demonstrates quite similar performance to SVM. Some of the valuable parameters of these classifiers were also tested to assess the sensitivity of the classifier to a number of parameters used. Some classifier parameters were also tested to assess the stability of the classifier. The experimental evaluations presented in Appendices 2 and 3 show that changing the kernel function (radial basis function vs polynomial) does not change the performance of classifiers to any great extent. Notably, more unstable results are provided by FFNN classification. The findings suggest that the number of iterations used for classification should be lower if the resolution of the DTM used is finer. For instance, for 500 iterations, the kappa indices for DTM 30 and 5 m are 0.47 and 0, respectively (Appendix 3). In contrast, for 100 iterations, the kappa indices for DTM 30 and 5 m are 0.41 and 0.40, respectively. Figure 5 represents also the performance of the classification methods used and some of their parameters. The intensity of the colour of cubes represents the performance of classifiers with reference to DTM resolution and the parameters of certain classifiers.

Optimal strategy for automatic landslide mapping

Based on the above discussion, the accuracy of ALM can be achieved with the agreement of a kappa index of around 0.5 and overall accuracy of around 80%. Applying a DTM resolution of 30 m, features with the best kappa (red bold font in Appendix 1) and FFNN classification present a kappa index of 0.5 and OA of 77%. In the case of SVM classification and the same DTM resolution and features, the kappa index is 0.55 and the OA is 81.4%. Similar results have also been demonstrated by other authors (Leshchinsky et al. 2015; Mezaal et al. 2017). Lower kappa values are likely related to under-prediction of landslides and over-prediction of non-landslide areas. Under-prediction is also particularly evident in the case of old denudated landslides, where the morphology has been changed by agricultural activities or physical processes. Over-prediction of landslides can be observed in river valley bottoms and anthropogenic forms, where the approach presented identified steep and rough sections of rivers or anthropogenic slopes as landslide areas. The best agreement (Kappa = 0.55) was achieved for the kappa-based feature selection method using SVM, with seven-degree polynomial kernel function and 30 m DTM resolution. The best accuracy for correctly classified landslide pixels (PA = 79%) is demonstrated by FFNN classification with 400 iterations and 30 m DTM and features created using the minimum kernel size. From the perspective of fast and easy-to-use rapid mapping with reference to after-event response, the SVM classification seems to be quite balanced. The selection of kernel function using the approach presented does not change the accuracy significantly and good agreement between results and reference data can been achieved. Moreover, the expert can ignore the kernel size of the calculation of diverse topographic indicators while performing the classification with many DTM derivatives (around 16). However, the selection of only one—but the most relevant—feature with the right kernel size can sometimes provide slightly better results in landslide detection than the composition of 16 DTM derivatives. This can be observed by comparing the results of ML classification for a DTM resolution of 5 m (Appendix 2) and the roughness index (Appendix 1).

Conclusions

This paper presents an extended sensitivity analysis of automatic landslide mapping (ALM) using a pixel-based approach. Analyses of different aspects influencing the pixel-based approach and its sensitivity are crucial for improving model performance and understanding their weight in the case of ALM. The sensitivity of 17 topographic parameters was investigated with reference to DTM resolution and kernel size, as follows: roughness and curvature (Bolstadt’s variant), as well as maximum curvature, minimum curvature, mean curvature, and mean slope (Evans 1979) and the standard deviation of mean slope, topographic position index, openness, and standard deviation of aspect. The optimal combinations (DTM resolution and kernel size) were found for all topographic parameters and then used for the final ALM. Moreover, this paper presents changes in the performance of final ALM based on the DTM resolution, kernel size, and classification method selected. In terms of classification methods, the study examined maximum likelihood (ML), feed-forward neural networking (FFNN), and the support vector machine (SVM). The results provide valuable guidelines for ALM using a pixel-based approach.

Specifically, based on the results, the study concludes that the most relevant features for landslide detection are the roughness index (Kappa = 0.39), openness (Kappa = 0.33), the topographic position index (Kappa = 0.34), curvature (Bolstad’s variant; Kappa = 0.34), minimum curvature (Kappa = 0.38), and mean slope (Kappa = 0.38). Moreover, the performance of the features presented primarily increases proportionally to coarser DTM resolutions; however, the peak of increasing accuracy is observed for a DTM resolution of 30 m. The features analysed in this study are more sensitive to the window size used, applying a coarse resolution of DTM, as can be observed from the minimum and maximum kappa values in Appendix 1. In the case of maximum likelihood (ML) classification, sometimes one topographic layer (roughness index) and DTM may provide better performance than a set of topographic indicators and ML classification. ML classification demonstrates lower performance in comparison to support vector machine (SVM) and feed-forward neural network (FFNN) classification. However, in the case of a coarser resolution of DTM, the difference in kappa values between the classifiers did not exceed 0.05. SVM demonstrated the best performance of classification in the case of coarser DTM resolution and appeared not to be very sensitive to the selection of kernel function. FFNN classification demonstrated the best performance for the finest resolution; however, the classifiers are very sensitive to the number of iterations used.

Having considering the aforementioned results, the use of PBA and an extended set of topographic variables delivered through coarser DTM resolution and machine learning classification methods (FFNN and SVM) allow effective ALM. This approach provides a potential geospatial solution for managing landslide hazards and conducting landslide risk assessments. The approach presented notably requires less input analysis and expertise than other approaches (e.g., object-oriented approaches). Nonetheless, this approach presents some limitations, especially in the case of the amalgamation of single landslides. This should be the subject of future research. Another substantive issue that should be investigated in further research is sensitivity analysis of ALM for different landslide types and sizes.