1 Introduction

Assessing the occurrence of shallow rainfall-induced landslides is crucial for implementing effective short-term and long-term risk mitigation actions.

Landslide early warning systems (LEWSs) are non-structural, cost-effective tools aimed at mitigating landslide risk that can be designed and used at different scales or resolutions: local systems deal with a single landslide at slope scale, while territorial systems deal with multiple landslides at regional scale, i.e., over a basin, a municipality, a region, or a nation [1]. Various LEWSs operating at different spatial scales are currently operational worldwide [2,3,4].

Alert models for shallow rainfall-induced landslides at regional scale are typically based on rainfall thresholds expressed in terms of cumulative rainfall or average intensity with respect to the duration of the rainfall event, completely neglecting the antecedent conditions [5]. This means that, for each expected precipitation event of a certain intensity, the system releases a certain level of alert regardless of the state of the soil, increasing the probability of false alarms. In this regard, recent studies have shown that the reliability of an alert model could be increased by including other variables, such as water content [6]. However, testing numerous atmospheric variables using traditional techniques can be quite complex and costly.

Machine learning (ML) techniques are becoming popular for modeling complex problems in landslide analyses [7], as they are starting to demonstrate good predictive performance compared to conventional methods. Since the introduction of ML methods to the landslide community, many studies have been carried out to explore the usefulness of ML in landslide research and to look at some classic landslide problems from an ML point of view. The main areas of investigation include landslide detection and mapping [8,9,10], landslide spatial analyses [11,12,13], and temporal forecasting of landslides [14,15].

With reference to the temporal forecasting of rainfall-induced landslides, ML methods have recently been explored to determine rainfall thresholds, i.e., the rainfall conditions that, when reached or exceeded, are likely to trigger a landslide [16,17,18].

In [19], rainfall indices based on 60-min cumulative rainfall and a calculated soil–water index were used to set up a critical line for a new early-warning system, employing a Radial Basis Function Network.

ML methods have also been employed to explore the relationship between the amount of precipitation and the groundwater level with reference to conditions of instability, especially for deep-seated landslides. In particular, in [20], Artificial Neural Networks (ANN) and Support Vector Machines (SVM) were used to build two nonlinear time-series models able to predict groundwater level fluctuations based on groundwater level, precipitation, and tide level data. In [21], Random Forest (RF) was adopted to build a predictive model of the fluctuation of the groundwater level for the Kostanjek landslide. In [22], SVM, combined with chaos theory, was adopted to predict the daily groundwater levels of the Huayuan landslide and the weekly and monthly groundwater levels of the Baijiabao landslide in the Three Gorges Reservoir Area (TGRA) of China. Finally, in [23], two ML methods based on the combination of genetic algorithms with an ANN and an SVM, respectively, were studied to predict the groundwater level fluctuation of the Duxiantou landslide, located in Zhejiang Province, China.

In this context, more extensively and deeply investigated in [24], this preliminary study proposes the use of ML techniques to select the most relevant variables for the triggering of rainfall-induced landslides at regional scale. The ML method used for predicting the occurrence of landslides is the Likelihood-Fuzzy Analysis (LFA) [25], which, differently from other state-of-the-art ML methods, pursues two objectives: good performance and the production of explainable models. In particular, it is mainly based on the hybridization of multiple approaches (feature selection, fuzzy logic, statistics, and rule-based models) to produce predictive models that achieve, at the same time, good performance, in terms of both recall and precision, and explainable results, in terms of linguistically-ranged input variables and explicit correlations detected between them and the final outcomes.

The predictive models were tested in one of the alert zones defined by civil protection for the management of geo-hydrological risk in Campania region, Italy. Two data sources were used in the analysis. The atmospheric variables are derived from the ERA5-Land atmospheric reanalysis [26]. The data on landslide events are retrieved from “FraneItalia”, a georeferenced catalog of landslides that occurred in Italy, developed by consulting online sources from 2010 onwards [27]. The models were calibrated and validated in the Camp-3 alert zone over a period spanning from 2010 to 2019 [28], in order to define combinations of rainfall variables and soil water content for the prediction of the occurrence of landslides.

This paper is organized as follows. Section 2 explains the theoretical concepts of the Likelihood-Fuzzy Analysis technique used herein. Section 3 describes the datasets and the experimental procedure. Section 4 evaluates the performance of the proposed approach in the case study, also in comparison with other state-of-the-art ML methods. Finally, Sect. 5 summarizes the conclusions drawn from this work.

2 Background

2.1 Likelihood fuzzy analysis

LFA is a method proposed in [25, 29], aimed at mining, from a supervised training dataset made of different features/variables, one or more multivariate fuzzy models able to predict a class y related to a new incoming feature vector \(x = \{x^{(1)},..., x^{(n)}\}\).

This method was chosen for this study since it is able to generate models exhibiting the following characteristics: (i) good performance with respect to state-of-the-art methods, and robustness to uncertain data; (ii) interpretability, describing features through terms of linguistic variables (low/medium/high) and correlating them with the outcomes via if-then rules, to show the dependence of predictions on features; (iii) a confidence measure for each prediction, expressed as the probability of the class of interest for each occurrence of input data.

A multivariate fuzzy model is made of two main parts, namely the fuzzy sets associated with the features of interest and the rule base, and is used for classifying objects through the process of fuzzy inference.

In more detail, the range of each j-th feature \(X^{(j)}\) is partitioned into \(M_j\) fuzzy sets, described by membership functions \(\mu _F^{(j)}\) with specific positions in the admissible range of values. The fuzzy sets pertaining to each feature represent the terms of the associated linguistic variable (e.g., low, medium, high).

The model is made of a combinatorial set of R rules, where the \(\sigma\)-th rule, with \(\sigma \in [1,...,R]\), is of the following type:

$$\begin{aligned} if \; x^{(1)} \; is \; F^{(1)}_\sigma \; and \; ...\; and\; x^{(n)}\; is\; F^{(n)}_\sigma \; then\; P_\sigma (c_1)\; ....\; P_\sigma (c_K) \end{aligned}$$
(1)

where \(c_1,c_2 ... c_K\) are the different K output classes.

The inference process is performed as follows. Each data sample \(x=\{x^{(1)},...,x^{(n)}\}\) fires the \(\sigma\)-th rule with a strength:

$$\begin{aligned} FS_{\sigma }(x)=\prod _{j=1}^{n}\mu _{F^{(j)}_{\sigma }}(x^{(j)}) \end{aligned}$$
(2)

The implication of each consequence class is modelled as:

$$\begin{aligned} IMP_{\sigma }(x)= FS_{\sigma }(x)\cdot P_\sigma (c_k) \end{aligned}$$
(3)

Finally, different implications are aggregated as:

$$\begin{aligned} AGG(x)= \sum _{\sigma =1}^{R}IMP_{\sigma }(x) \end{aligned}$$
(4)

and the aggregations of all the classes are normalized. In general, weights can be associated with the rules, which turns (4) into a weighted sum; the weights are omitted for single-feature models, and for multivariate models when they do not improve classification significantly.

In case of two classes \(c_1\) and \(c_2\) as output, (4) gives a number in [0,1] that approximates the probability of \(c_1\) class. Once a threshold T is chosen, the final inference result is:

$$\begin{aligned} y = \bigg \{ \begin{array}{rl} c_1 &{} if \,AGG(x) > T\\ c_2 &{} otherwise \\ \end{array} \end{aligned}$$
(5)
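As an illustration, the inference chain (2)–(5) can be sketched in a few lines of Python. The triangular membership functions, the rules, and the threshold below are illustrative placeholders, not the ones calibrated in this work:

```python
import numpy as np

def triangular(x, a, b, c):
    """Triangular membership function with feet at a and c and peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def infer(x, rules, threshold=0.5):
    """Fuzzy inference following Eqs. (2)-(5) for two classes c1 and c2.

    Each rule is a pair (memberships, p_c1): one membership function per
    feature and the consequent probability of class c1.
    """
    agg_c1 = agg_c2 = 0.0
    for memberships, p_c1 in rules:
        # Eq. (2): firing strength as the product of the memberships
        fs = np.prod([mu(xj) for mu, xj in zip(memberships, x)])
        # Eqs. (3)-(4): implication of each class, aggregated by summation
        agg_c1 += fs * p_c1
        agg_c2 += fs * (1.0 - p_c1)
    total = agg_c1 + agg_c2
    p_c1 = agg_c1 / total if total > 0 else 0.0  # normalization over classes
    return ("c1" if p_c1 > threshold else "c2"), p_c1  # Eq. (5)
```

For instance, a hypothetical univariate model with a "low" and a "high" rainfall term could be instantiated as `rules = [([lambda r: triangular(r, -10, 0, 40)], 0.01), ([lambda r: triangular(r, 0, 40, 80)], 0.37)]` and queried with `infer([50.0], rules, threshold=0.2)`.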

LFA aims to determine a fuzzy model by optimizing a chosen performance measure on a given dataset, in particular through the optimization of the fuzzy sets representing the linguistic terms of each variable, of the number of terms for each variable, of the set of variables making up the model, and of the fuzzy rules. The fundamental steps of LFA are as follows.

Firstly, the likelihood functions are calculated, which describe the posterior probabilities of classes \(P(c_k\mid x^{(j)})\) as functions of each of the input features \(x^{(j)}\). Then, each of these functions is approximated with a linear combination of membership functions of fuzzy sets, which constitute an interpretable partition of the variable range. Finally, rule weights (if foreseen) and consequents of a complete multivariate rule base are calculated to get the fuzzy model. More details are given in [25, 29].
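As a minimal illustration of the first step, the empirical posterior \(P(c_k\mid x^{(j)})\) of a single feature can be estimated by simple binning before any fuzzy-set approximation. The function below is a didactic sketch under that assumption, not the actual LFA estimator of [25, 29]:

```python
import numpy as np

def posterior_curve(x, y, bins=10):
    """Empirical posterior P(c1 | x) per bin of a single feature.

    x : (N,) feature values; y : (N,) binary labels (1 = class c1).
    Returns bin centers and the fraction of c1 samples in each bin,
    i.e. the curve that LFA then approximates with fuzzy sets.
    """
    edges = np.linspace(x.min(), x.max(), bins + 1)
    centers = 0.5 * (edges[:-1] + edges[1:])
    # Assign each sample to a bin (clipping the right edge into the last bin)
    which = np.clip(np.digitize(x, edges) - 1, 0, bins - 1)
    post = np.full(bins, np.nan)
    for b in range(bins):
        mask = which == b
        if mask.any():
            post[b] = y[mask].mean()  # empirical P(c1 | x in bin b)
    return centers, post
```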

3 Materials and methods

3.1 Study area and data used

The study area is Camp-3, one of the eight alert zones defined by the Civil Protection for the management of hydro-meteorological risk in Campania (Italian DPGR 299/2005). This area, with an extension of approximately 1619 km², includes 109 municipalities and the Lattari, Picentini and Partenio mountains (Fig. 1).

Fig. 1
figure 1

Map of the Camp-3 alert area depicting the 120 landslides considered in this study and the centroids of the ERA5-Land cells. The insets show the position of the study area in Campania and Italy

The orographic conditions and the proximity of the sea favor the formation of convective storms [30, 31]. Moreover, the presence of pyroclastic deposits of volcanic origin on carbonate substrates makes these areas highly susceptible to the triggering of fast-moving landslides, such as shallow landslides, debris flows, debris avalanches, and hyperconcentrated flows [32]. Some of the most catastrophic landslides in Europe were recorded in the area, including the tragic events that occurred on the Pizzo d’Alvano massif between 4 and 5 May 1998, when about 2 million m³ of material was mobilized, causing at least 160 victims [33].

The information on landslides that occurred in the study area was retrieved from FraneItalia, a georeferenced catalog of recent Italian landslides developed by consulting online sources from 2010 onwards [27]. Landslides are classified into two categories: single landslides (SLE), for records that report a single landslide; areal landslides (ALE), for records that refer to multiple landslides caused by a single trigger in the same Weather Alert Zone. In Camp-3, 120 rainfall-induced landslide events (72 SLE and 48 ALE) were recorded from 2010 to 2019, most of which (96 out of 120) occurred between October and March.

The rainfall and soil water content data are derived from the ERA5-Land atmospheric reanalysis [34], developed by the European Center for Medium-Range Weather Forecasts (ECMWF). Atmospheric reanalysis provides a consistent and complete picture of the atmosphere by combining observational data from satellites and ground sensors with physically-based meteorological models. ERA5-Land provides about 50 atmospheric variables at a spatial resolution of 9 km and an hourly temporal resolution. Given the importance of an adequate parameterization of soil processes, ERA5-Land is, strictly speaking, a “replay” of the land component alone, forced by atmospheric variables statistically interpolated from the ERA5 atmospheric reanalysis (which has a spatial resolution of about 31 km).

3.2 Methodology

This study starts from the assumption that, at this scale, i.e., considering the entire study area of almost 2000 km² as a whole, rainfall-induced landslides can be correlated with a combination of measures linked to two factors: (i) a predisposing condition represented by the water content in the surface layers of the soil, and (ii) a triggering condition represented by the rainfall variables [6].

The hourly data of the ERA5-Land dataset were pre-processed in order to obtain 13 input features, calculated with a daily temporal discretization consistent with the information contained in the landslide catalog used as the outcome variable, that is to say:

  • daily maximums of the geographical averages of rainfall values, cumulated over time intervals of 1 h, 3 h, 6 h, 9 h, 12 h, 18 h, 24 h, 36 h, 48 h and 72 h (from F1 to F10);

  • daily maximum of the geographic standard deviation of the hourly rainfall values (F11);

  • daily maximum of the geographical average of the hourly values of the soil water content (F12);

  • daily maximum of the geographical standard deviation of the hourly values of the soil water content (F13).
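A hedged sketch of this pre-processing is given below, assuming two pandas DataFrames `hourly_rain` and `hourly_swc` indexed by hourly timestamps, with one column per ERA5-Land grid cell (all names are illustrative):

```python
import numpy as np
import pandas as pd

def daily_features(hourly_rain, hourly_swc,
                   windows=(1, 3, 6, 9, 12, 18, 24, 36, 48, 72)):
    """Sketch of the 13 daily features (F1-F13) described above.

    hourly_rain, hourly_swc : DataFrames indexed by hourly timestamps,
    one column per grid cell of the study area.
    """
    feats = {}
    # F1-F10: geographical average of hourly rainfall, cumulated over each
    # window, then reduced to the daily maximum
    mean_rain = hourly_rain.mean(axis=1)
    for i, w in enumerate(windows, start=1):
        cum = mean_rain.rolling(window=w, min_periods=w).sum()
        feats[f"F{i}"] = cum.resample("D").max()
    # F11: daily max of the geographic std of hourly rainfall
    feats["F11"] = hourly_rain.std(axis=1).resample("D").max()
    # F12: daily max of the geographic mean of hourly soil water content
    feats["F12"] = hourly_swc.mean(axis=1).resample("D").max()
    # F13: daily max of the geographic std of hourly soil water content
    feats["F13"] = hourly_swc.std(axis=1).resample("D").max()
    return pd.DataFrame(feats)
```

Note that, since the spatial average is linear, averaging over cells and then cumulating in time is equivalent to the reverse order.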

To find a model that associates an outcome (landslide/no landslide, or landslide probability) with the known rainfall and soil water content data, the LFA method described in Sect. 2 was applied. The Mathematica 8 software was employed for the implementation.

The performance measure chosen in this study for the optimization of the fuzzy sets, the number of terms for each variable, the number of variables to be used, and the final fuzzy rules is the Squared Classification Error (SCE):

$$\begin{aligned} SCE = \frac{1}{KN} \sum \limits _{k=1}^{K} \sum \limits _{i=1}^{N} (P_i(k)- \delta ^k_i)^2 \end{aligned}$$
(6)

where \(P_i(k)\) is the probability of the k-th class calculated by the model for the i-th sample, and \(\delta ^k_i\) is 1 if the i-th sample is associated with the k-th class, and 0 otherwise.
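Eq. (6) translates directly into a few lines of NumPy; here `P` is an N×K array of model-predicted class probabilities and `y` holds the true class index of each sample (names are illustrative):

```python
import numpy as np

def squared_classification_error(P, y):
    """Squared Classification Error, Eq. (6).

    P : (N, K) array of predicted class probabilities per sample.
    y : (N,) array of true class indices in [0, K).
    """
    N, K = P.shape
    delta = np.zeros_like(P)
    delta[np.arange(N), y] = 1.0  # one-hot encoding of the true classes
    return np.sum((P - delta) ** 2) / (K * N)
```

A perfect probabilistic classifier yields an SCE of exactly zero.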

The SCE was also used for the choice of the model, giving precedence to performance over interpretability, which is in any case largely guaranteed by construction, as shown in the example of Fig. 2 and in the following results.

Fig. 2
figure 2

Examples of fuzzy logic applied by defining two (a) and three (b) linguistic terms for a generic variable (feature) of the ML model

4 Results

The 13 independent variables defined by reprocessing hourly precipitation and soil water content were correlated, through univariate models, with the positive class of the 120 days with landslides in Camp-3 and the negative class of the remaining days from 2010 to 2019.

Figure 3 shows the values of the objective function (SCE, to be minimized) for each variable. In spite of a rather limited range of error variation, a monotonically decreasing trend emerges for precipitation intervals between 1 h and 18 h (the duration characterized by the minimum error), with a slight increase up to 72 h and significantly higher values for the precipitation standard deviation and the soil water content.

Fig. 3
figure 3

Trend of the objective function for models defined using, individually, the 13 variables calculated with daily time discretization

Among the univariate models, the one with the best predictive capability is shown in Fig. 4.

Fig. 4
figure 4

Probability of landslide and non-landslide associated with the model based on cumulative precipitation at 18 h

By dividing the 18 h cumulative rainfall into three fuzzy sets (low/medium/high), it can be seen that the maximum probability of landslide (37%) is obtained for high values (greater than about 40 mm). It should be remembered that this value refers to precipitation averaged over the entire study area. The graph also shows two further ranges: an intermediate range in which the probability of a landslide is around 23%, and a range of low precipitation values in which the probability of a landslide is minimal (1%).

In addition, a multivariate model combining the 18 h cumulative rainfall and the standard deviation of the soil water content was developed (Fig. 5). In particular, the model combining the two variables associates a null probability of landslides with low values of both independent variables, and makes it possible to identify ranges in which the probability of landslides rises to 46%. Finally, the results highlight the additional contribution of the soil water content.

Fig. 5
figure 5

Probability of landslide and non-landslide associated with the model that combines the maximum of the 18 h cumulated rainfall and the standard deviation of the water content

4.1 Comparison with other ML methods

There is no consensus on an “optimal” ML method for landslide studies, even when looking at the results of the most recent comparative studies in landslide detection or spatial and temporal forecasting [24]. Therefore, although the objective of this work is not to demonstrate that the LFA method is the best possible for this case study, other established ML methods from the literature were tested to assess its effectiveness in terms of performance and interpretability.

In particular, all the methods chosen are available in the Waikato Environment for Knowledge Analysis (WEKA 3.8) [35], and can be summarized in terms of the category they belong to and their configuration parameters, as follows:

  • Logical/symbolic classification

    • RIPPER rule-based classifier [36] (with batch size 100, 3 folds for pruning, minimum total weight of the instances in a rule 2.0, 2 optimization runs);

    • C4.5 decision tree [37] (with batch size 100, 3 folds for pruning, confidence factor 0.25, minimum 2 instances per leaf, subtree raising and MDL correction);

  • Statistical learning

    • Naïve Bayes (NB) [38] (with batch size 100);

    • Bayesian Network (BN) [39] (with batch size 100, Simple Estimator algorithm with alpha = 0.5 for finding the conditional probability tables, K2 learning algorithm with max 1 parent and Bayes score type);

  • Instance-based learning

    • K-Nearest Neighbours (K-NN) [40] (with K equal to 1, batch size 100, no distance weighting, no limit to the number of training instances, brute force search algorithm for nearest neighbour search, Euclidean distance);

  • Function-based classification

    • Logistic Regression (LR) [41] (with batch size 100, ridge value in the log-likelihood 10⁻⁸, unlimited iterations, BFGS updates);

    • Support Vector Machine (SVM) [42] (with batch size 100, complexity parameter c=1.0, epsilon for round-off error 10⁻¹², tolerance parameter 0.001, multinomial logistic regression model with a ridge estimator as calibration method with batch size 100, ridge value in the log-likelihood 10⁻⁸, unlimited iterations, BFGS updates, and polynomial kernel with cache size 250,007 and exponent 1.0);

  • Perceptron-based learning

    • Multi-layer perceptron network (MLP) [43] (with batch size 100, one hidden layer with number of neurons equal to (attributes + classes)/2, learning rate 0.3, momentum 0.2, no decay of learning rate, 500 epochs);

  • Ensemble learning

    • Random Forest (RF) [44] (with batch size 100, and 100 trees in the random forest).

The performance of these ML models was calculated through a 10-fold cross-validation, in terms of the F1 score, Precision, and Recall metrics.
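The evaluation protocol can be sketched with NumPy alone (the actual runs used WEKA's built-in cross-validation); `fit_predict` stands in for any of the classifiers above, and all names are illustrative:

```python
import numpy as np

def precision_recall_f1(y_true, y_pred):
    """Precision, recall and F1 for the positive (landslide) class."""
    tp = np.sum((y_pred == 1) & (y_true == 1))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

def cross_validate(fit_predict, X, y, folds=10, seed=0):
    """k-fold cross-validation; fit_predict(X_tr, y_tr, X_te) -> labels."""
    idx = np.random.default_rng(seed).permutation(len(y))
    metrics = []
    for test_idx in np.array_split(idx, folds):
        train_idx = np.setdiff1d(idx, test_idx)
        y_pred = fit_predict(X[train_idx], y[train_idx], X[test_idx])
        metrics.append(precision_recall_f1(y[test_idx], y_pred))
    return np.mean(metrics, axis=0)  # mean precision, recall, F1
```

Averaging the three metrics over the folds mirrors the per-model figures reported in Table 1.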

As far as performance is concerned, the results reported in Table 1 reveal that the models characterized by the best performance are those obtained by the LFA and NB methods, as both reach an F1 score of 0.27. Moreover, the models obtained with the LFA method were superior in terms of F1 score to those obtained with the other state-of-the-art (interpretable and not) ML methods tested in this work. Therefore, this confirms both the validity of the results achieved and the model’s applicability to support the prediction of the occurrence of rainfall-induced landslides.

Table 1 Performance indicators associated with the tested machine learning models (the models obtained through the selection of variables are indicated with an asterisk)

Moreover, the interpretability of the models obtained by LFA is not comparable with that of the other tested ML methods. Indeed, LFA produces models able to give a clear explanation of the inference process, based on a rule base built on top of interpretable linguistic terms. For example, with regard to the two-dimensional model described above, the functions represented in Fig. 5 make it possible to clearly distinguish the linguistic labels of both the maximum of the 18 h cumulated rainfall and the standard deviation of the water content. Given these labels, the two-dimensional rule base clearly states its logical consequence in terms of probability of rainfall-induced landslides. Indeed, it is possible to distinguish the case when “the maximum of 18 h cumulated rainfall is low”, which implies a probability of landslides equal to zero (or almost zero), from the remaining cases when “the maximum of 18 h cumulated rainfall is medium or high” and, depending on whether “the standard deviation of the water content is low or high”, the probability of landslides increases until it reaches the value of 46%.

Furthermore, the use of fuzzy logic behind the LFA method naturally offers the possibility of creating models robust to uncertain data, i.e., models that produce small changes in the output in response to small changes in the input features. In fact, the LFA method (as well as LR and the statistical methods) provides an output probability that varies continuously in the feature space. Therefore, it is robust with respect to the uncertainty of both the input features and the output.

Finally, by means of these output probabilities, the models found by LFA make it possible to associate a confidence with the prediction of new cases, which reflects, more than sensitivity or specificity, the real uncertainty associated with each potential rainfall-induced event. It is worth noticing that some other ML methods (like LR and the statistical methods) also produce confidence grades linked to their responses; however, to the best of our knowledge, the LFA method is the only one that, in addition to giving a fully interpretable model, approximates the probabilities associated with each outcome.

With respect to all the other ML methods mentioned in Table 1, it is worth noting that none of them simultaneously presents all the described characteristics. Indeed: (i) MLP, the instance-based methods, and SVM generate models that are not interpretable at all; (ii) the RIPPER rule-based classifier is not able to assess a classification confidence in terms of outcome probabilities; (iii) C4.5 decision trees and RF are not robust to data uncertainty, since they define sharp boundaries in the feature space; (iv) the statistical methods (NB and BN) and LR generate models with a level of interpretability not comparable to LFA, since they are not based on logical rules and linguistic terms able to more clearly highlight the relationships between the input features and the final prediction.

Summarizing, the LFA method has proven to be a valid support for identifying the most relevant variables for the triggering of shallow rainfall-induced landslides and for clearly representing their relations with the predicted outcome, thanks to the model’s interpretability. Moreover, the good performance of the models found in the present work and the possibility of producing robust and confidence-based results confirm that the LFA method can be proficiently applied, in place of more classical ML approaches, for building rainfall-induced landslide alert models.

5 Conclusions

The use of machine learning techniques for the definition of models that combine, at regional scale, rainfall and soil water content variables for the prediction of rainfall-induced landslides was tested in this preliminary study, in the Camp-3 alert zone, over the period 2010–2019.

The developed models made it possible to identify some variables significantly correlated with the considered landslides, and to calculate the probability of occurrence of the events rather than simple dichotomous relationships.

Possible future developments of the study include: (i) comparisons with other alert models (e.g., the model used by the Campania region, empirical rainfall thresholds from the literature); (ii) analyses taking into consideration only the most numerous areal events; (iii) calibration and validation of models that use other potentially relevant variables.