Introduction

In many Italian regions, rainfall is the primary trigger of shallow landslides that can cause fatalities, widespread damage and economic losses (Guzzetti 2000). Due to its geographical, geological and geomorphological setting, the Liguria region is prone to floods and landslides. In recent decades, Liguria has often experienced considerable Damaging geo-Hydrological Events, mainly related to intense and sometimes very localized rainfall (Silvestro et al. 2012), leading to important environmental, social and economic consequences (Cevasco et al. 2015).

For this reason, the mechanisms by which rainfall triggers shallow landslides have been the subject of a large number of studies in Italy, which apply different methods and are driven by different purposes. The ability to forecast the spatial and temporal occurrence of rainfall-induced shallow landslides is of primary interest, among others, to the Italian National Department of Civil Protection (DPCN), which would benefit from effective warning systems to mitigate the associated risk (Ratto et al. 2013). Physically based models, which aim at reproducing the nature of the phenomenon, couple a hydrologic model of the sub-surface flow resulting from the meteoric event with a geotechnical model describing the stability conditions of the slope. They require accurate knowledge and definition of the numerous physical parameters involved, which makes their application over large territories difficult (Segoni et al. 2009). On a regional scale, the most widely used methods are based on the definition of an empirical correlation between rainfall and landslides, obtained from historical data of landslide occurrence. Such models are generally obtained by deriving a mathematical equation that represents the threshold beyond which landslides have occurred in the past and, it is assumed, will occur in the future. In the literature, different rainfall variables have been considered: the most common are intensity and duration of critical precipitation (Aleotti 2004; Guzzetti et al. 2008), accumulated antecedent precipitation from 1 day up to a few months (Cardinali et al. 2006; Martelloni et al. 2012) and soil moisture (Segoni et al. 2018). Guzzetti et al. (2008) highlighted that a standardized procedure to define critical rainfall does not exist: for example, the rainfall intensity (and associated duration) may refer either to the peak precipitation rate measured during the rainstorm or to the mean precipitation rate of the whole rainstorm.
The definition of the best variables and of adequate time periods for cumulative rainfall is not straightforward (Martelloni et al. 2012). The choice of the right parameters depends primarily on the landslide typology: there is general agreement that debris flows and shallow landslides are preferentially triggered by short and intense rainfall, while deep-seated landslides are more commonly connected with prolonged and less intense rainfall events (Crosta 1998). In areas affected by both shallow and deep-seated landslides, it is essential to define a methodology flexible enough to encompass both. Some authors have tried to indirectly include other predisposing factors, such as lithological information or morphology, by applying calibration at a sub-regional scale (Gariano et al. 2015). As a result, existing and new thresholds are difficult to compare and evaluate quantitatively (Guzzetti et al. 2008; Brunetti et al. 2010). These methods focus on the temporal scale, considering individual landslides as discrete point events, and thus do not provide spatial information, whereas landslide hazard management requires anticipating not only “when” but also “where” and “how large” a landslide is expected to be (Lari et al. 2014). Recent studies have worked on combining the spatial and temporal prediction of landslide occurrence by applying statistical or machine learning algorithms (Lombardo et al. 2020).

Data-driven methods identify patterns or trends by analysing and interpreting the relationships between data. These methods have become increasingly popular over the last few years, thanks to the availability of big data and improvements in computational power. Machine learning models, such as random forests, artificial neural networks, support vector machines and ensemble techniques, have been successfully applied to image-based landslide detection (Yu et al. 2017), landslide susceptibility analysis (Park 2019; Reichenbach et al. 2018; Lee et al. 2020; Tien et al. 2016; Kumar et al. 2017) and prediction (Huang and Xiang 2018).

The objective of the present work is to respond to the need for a Landslide Early Warning System (LEWS) for the Liguria Civil Protection Department, with a predictive model of rainfall-induced shallow landslide occurrence in the five Alert Zones (AZs) into which the Liguria region is divided, that can be easily integrated into the existing operational early warning system for weather forecasting. Simplicity, low computational cost and explainability (that is, the ability to understand the patterns that the model learns from the training datasets and the factors that influence its outputs) were crucial factors in the selection of the model.

Material and methods

The study area

The Liguria Region lies between the Ligurian Alps and the Ligurian Apennines to the north and faces the Ligurian Sea to the south, with an unbroken chain that forms a veritable ridge.

From the line of the watershed, an asymmetrical arrangement of the slopes can be observed, involving an average higher energy value along the Tyrrhenian sector than in the Po valley. In particular, the maritime side, in the Apennine stretch, presents an arrangement of the valleys parallel to the coastline, unlike the Alpine area where the valley axis has an average submeridian structure. Geological information can be found on the official regional cartographic website (http://www.banchedati.ambienteinliguria.it/).

Due to its geographical position and morphology, Liguria is prone to various types of weather scenarios. Among them, two main types of precipitation can be distinguished: thunderstorm precipitation, which is more sudden and intense, and advective precipitation, which is generally weaker in intensity but more persistent.

The analysis of precipitation fields shows that the eastern part of the region is rainier than the western part, not only in terms of accumulated precipitation (Fig. 1), but also in terms of the number of rainy days, daily precipitation values and consecutive dry and rainy days (Agrillo and Bonati 2013).

Fig. 1 Cumulative precipitation (mm), annual average 1961–2010 (Agrillo and Bonati 2013)

The operational system for flood and landslide risk in Liguria is based on the division of the regional area into five Alert Zones (AZs), defined for hydro-meteorological monitoring and early warning (Fig. 2). The subdivision is based on a physiographic zoning that respects the integrity of catchment areas and municipal administrative boundaries, keeps each zone at a spatial scale compatible with the limits of forecast reliability, and distinguishes homogeneous climatic areas. The hydrogeological and hydraulic risk is thus determined for each AZ.

Fig. 2 Identification of the Alert Zones

Rainfall data

The reference document that defines the hydrogeological/hydraulic warning procedure (D.G.R. 1116 of 23/12/2020) identifies two specific quantities to be evaluated and compared to a defined threshold in the Hydrological Assessment procedure: the 3-h cumulative precipitation height averaged over 100 km², which refers to precipitation intensity, and the 12-h cumulative precipitation height averaged over the Alert Zone, which refers to precipitation quantity. These two rainfall-related variables were therefore considered as predictors for the developed model, as they are evaluated daily from the meteorological forecast models for the hydrogeological/hydraulic assessment procedure. The variables were estimated from the hourly measurements of the ground network of the Liguria Region for the period 2014–2019. The measurements from 120 rain gauges, distributed as shown in Fig. 3, were spatially interpolated using the inverse distance weighting (IDW) method. With a weighting power equal to 2 and a search radius of 0.2 degrees along the x and y coordinates, the IDW method yields the 1-h interpolated rainfall map for the entire Liguria domain.
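As an illustration of the interpolation step, the following is a minimal sketch of inverse distance weighting with power 2 and a 0.2-degree search radius. It is not the code used in the study: the function name `idw_interpolate`, the use of a circular search radius (rather than separate x and y radii) and the choice to return NaN where no gauge falls within the radius are assumptions made for the example.

```python
import numpy as np

def idw_interpolate(gauge_xy, gauge_values, target_xy, power=2.0, radius=0.2):
    """Inverse distance weighting of scattered gauge values onto target points.

    gauge_xy:     (n, 2) gauge coordinates in degrees (lon, lat)
    gauge_values: (n,)   rainfall at each gauge (mm)
    target_xy:    (m, 2) coordinates where rainfall is estimated
    Returns an (m,) array; NaN where no gauge lies within `radius` (assumption).
    """
    gauge_xy = np.asarray(gauge_xy, dtype=float)
    gauge_values = np.asarray(gauge_values, dtype=float)
    target_xy = np.asarray(target_xy, dtype=float)

    # Distance between every target point and every gauge: shape (m, n)
    diff = target_xy[:, None, :] - gauge_xy[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))

    out = np.full(target_xy.shape[0], np.nan)
    for i, d in enumerate(dist):
        exact = d == 0.0
        if exact.any():
            # Target coincides with a gauge: use the measured value directly
            out[i] = gauge_values[exact].mean()
            continue
        near = d <= radius
        if not near.any():
            continue  # no gauge inside the search radius
        weights = 1.0 / d[near] ** power
        out[i] = np.sum(weights * gauge_values[near]) / np.sum(weights)
    return out
```

Applied to every cell centre of a regular grid, such a function would produce one interpolated map per hour; a target point three times closer to one gauge than to another weights the nearer gauge nine times more when the power is 2.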

Fig. 3 Spatial distribution of weather stations in the Alert Zones of the Liguria region

Based on the observed and interpolated 1-h rainfall measurements, the two variables previously described were computed: (1) the 12-h rainfall accumulation map, which represents the accumulated rainfall in the defined time period, and (2) the 3-h rainfall peak over a domain of 100 km², which represents the maximum rainfall in the defined time period and domain.

The 12-h accumulated rainfall map was averaged over each AZ domain, while for the 3-h rainfall peak the maximum value within each AZ domain was computed. The resulting 12-h accumulated rainfall and 3-h rainfall peak time series, saved at daily frequency, were used as input features.
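The aggregation described above can be sketched as follows. This is a hedged illustration rather than the procedure actually implemented: it assumes that the daily value is the maximum over all rolling windows within the day, and that each grid cell already represents a 100 km² average, neither of which is stated in the text. The names `rolling_sum` and `daily_rain_features` are hypothetical.

```python
import numpy as np

def rolling_sum(hourly, window):
    """Sum over every run of `window` consecutive hours along axis 0."""
    csum = np.cumsum(hourly, axis=0)
    csum = np.concatenate([np.zeros_like(hourly[:1]), csum], axis=0)
    return csum[window:] - csum[:-window]

def daily_rain_features(hourly, az_mask):
    """Return (cum_12h, peak_3h) for one day and one Alert Zone.

    hourly:  (24, ny, nx) interpolated hourly rainfall maps (mm)
    az_mask: (ny, nx) boolean mask of the cells belonging to the AZ
    """
    acc12 = rolling_sum(hourly, 12)   # 12-h accumulation maps
    acc3 = rolling_sum(hourly, 3)     # 3-h accumulation maps
    # 12-h accumulation: spatial MEAN over the AZ, wettest window of the day
    cum_12h = acc12[:, az_mask].mean(axis=1).max()
    # 3-h peak: spatial MAXIMUM over the AZ, wettest window of the day
    peak_3h = acc3[:, az_mask].max()
    return float(cum_12h), float(peak_3h)
```

The point of the sketch is the asymmetry between the two features: the 12-h variable is a zone-wide average that characterizes prolonged rainfall, while the 3-h variable keeps the local maximum and therefore preserves short, intense and localized events.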

Note that individual rainfall events were not identified: an event that spans two different days is therefore split, and each calendar day is assigned only the rain measured on that day.

Soil moisture data

Soil moisture was estimated with the continuous and distributed hydrological model Continuum (Silvestro et al. 2013, 2015; Cenci et al. 2016). The model schematizes the behaviour of the root zone with a modified version of the Horton method (Gabellani et al. 2008), which represents the soil as a tank of maximum storage capacity Vmax(x,y), while the variable V(t,x,y) describes the variation of the storage over time t at location (x,y). The relative soil moisture (SM) is then evaluated as:

$$SM=\frac{V(t,x,y)}{V_{max}(x,y)}$$

The model has been used in many applications (Poletti et al. 2019; Silvestro et al. 2021; Lagasio et al. 2022), especially for flood forecasting and water management purposes. In this case, the implementation is similar, in terms of calibration and parameterization strategy, to that adopted in previous works (Davolio et al. 2017; Bruno et al. 2021).

The spatial resolution is set to 170 m, while the time resolution is 1 h. Since the model solves both mass and energy balances, continuous simulation allows the estimation of soil moisture based on a physical approach. The final product is a high-resolution map that covers the whole Liguria Region (Fig. 4).

Fig. 4 Example of rainfall map derived by interpolating the observed measurements (left image) and soil moisture map derived by the hydrological model (right image)

Soil moisture, obtained from hydrological reanalysis of the period considered, was then aggregated at the spatial scale of the Alert Zone, estimating the mean value in order to be used as an input feature in the proposed algorithm.

Landslide data and scenarios

For the analysed period 2014–2019, there were 359 days with conditions leading to a total of 2191 shallow landslides (debris and earth slides, debris flows); deep landslides and collapses due to gravity were excluded (Fig. 5). Information on landslide occurrence was obtained from various sources, such as newspapers, technical reports provided by the Liguria Region Civil Protection, municipal damage report sheets, the Fire Brigade and the Liguria Region Soil Defense Sector. Figure 6 shows the spatial distribution of landslides recorded per year. The higher number of landslides in the most recent years may be explained by increased efficiency in identifying and recording landslides. We acknowledge that, because of the incompleteness inherent to landslide information, the documented landslides are a subset of all rainfall-induced landslides that occurred in the 2014–2019 period.

Fig. 5 Histogram of the number of registered landslides per day

Fig. 6 Spatial distribution of landslides recorded in the database per year

Days with no landslide occurrence represent about 95% of the whole dataset, and the maximum number of landslides recorded in a single day and AZ is 45 (Fig. 5).

In accordance with the Regional Civil Protection guidelines (D.G.R. n. 1116/2020), the considered Alert Scenarios are:

  • Low: absence or low probability of phenomena at the local level

  • Ordinary: localized phenomena/effects on the ground

  • Moderate: widespread phenomena (in single basins or portions of the alert area)

  • High: numerous and/or typically extended phenomena over an entire alert area

Based on the study of the spatial distribution and number of landslides within each AZ, and on their comparison, a number of landslides was attributed to each scenario, as shown in Table 1. Figure 7 shows the occurrence of the different scenarios for each AZ in the considered period.

Table 1 Scenarios and alert scale definition

Fig. 7 Occurrence (in log scale) of the different types of scenarios for each Alert Zone

Model implementation

Machine learning algorithms can model non-linear relationships between the input variables (also called predictive variables) and the output variable, and have been applied to “learn” the relationship between landslide occurrence and landslide-related predictors (Liu et al. 2021).

Here, we introduce a polynomial kernel regularized least squares regression (KRLS) model to predict the daily number of landslides, using five features as predictors, as previously described:

  • The 12-h accumulated rainfall averaged over the AZ;

  • The 3-h rainfall peak over an area of 100 km²;

  • Soil moisture obtained at 11 p.m. of the previous day;

  • The Alert Zone;

  • Day of the year.

The soil moisture at 11 p.m. of the previous day represents the predisposing element, while the precipitation of the same day represents the triggering factor for both types of precipitation: short and intense events (3-h peak over an area of 100 km²) and prolonged rainfall events (12-h accumulated rainfall over the AZ).

The kernel method is based on the use of a symmetric and positive definite kernel function K(x, x’), which implicitly defines a non-linear function ɸ(x) mapping the vector of input variables x from the input space X to a new dot product space F, also called the feature space. Through the kernel function, we can model our data with a function that is linear in the feature space but no longer linear in the original input space. In this way, non-linear models can be fitted with computations that are essentially the same as in the linear case. In particular, we refer to regularized least squares (RLS), an algorithm defined by the square loss that considers linear functions as possible solutions:

$$f (x) = {x}^{T}w$$

where x is the vector of the input variables and w is the vector of weights. When introducing the feature map ɸ(x), associated with the kernel function K(x, x’), we have:

$$f(x) = {w}^{T}\phi (x)$$

which further becomes:

$$f_{w}(x) = \sum_{j=1}^{n}K(x_{j},x)\,c_{j}$$

where x1, …, xn are the inputs in the training set, c = (c1, …, cn) is the vector of coefficients and n is the number of training data. The hyper-parameters of the model are tuned by minimizing the squared loss over the available training samples. For the square loss function, the vector c of coefficients can be computed by solving the following linear system:

$${Y}_{n} =\left({K}_{n}+\lambda nI\right) c$$

where Yn is the vector of observed output variables, Kn is the kernel matrix evaluated on the training inputs, I is the identity matrix and λ is a regularisation parameter introduced to ensure that the minimization problem is well defined. Among the many possible kernel functions, a polynomial kernel was chosen (Eq. (1)), where the degree of the polynomial d is a hyperparameter to be tuned.

$$K\left(x,x'\right)=\left(x^{T}x'+1\right)^{d},\quad d\in \mathbb{N} \tag{1}$$

The polynomial kernel of degree d thus computes a dot product in the space spanned by all monomials up to degree d in the input coordinates. The advantage of using such a kernel as a similarity measure is that it allows us to construct algorithms in dot product spaces (Hofmann et al. 2008).
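The formulation above translates almost directly into a few lines of linear algebra. The sketch below is an illustrative implementation under stated assumptions, not the authors' code: the function names are hypothetical, NumPy's dense solver is used for the linear system, and no feature scaling or encoding of the categorical Alert Zone is shown.

```python
import numpy as np

def polynomial_kernel(A, B, degree):
    """K(x, x') = (x^T x' + 1)^d for every pair of rows of A and B."""
    return (A @ B.T + 1.0) ** degree

def krls_fit(X, y, degree, lam):
    """Solve (K_n + lambda * n * I) c = Y_n for the coefficient vector c."""
    n = X.shape[0]
    K = polynomial_kernel(X, X, degree)
    return np.linalg.solve(K + lam * n * np.eye(n), y)

def krls_predict(X_train, c, X_new, degree):
    """f(x) = sum_j K(x_j, x) c_j, evaluated for each row x of X_new."""
    return polynomial_kernel(X_new, X_train, degree) @ c

# Toy usage on synthetic data (not the landslide dataset): y = x^2
X_toy = np.linspace(-1.0, 1.0, 9).reshape(-1, 1)
y_toy = X_toy[:, 0] ** 2
c_toy = krls_fit(X_toy, y_toy, degree=2, lam=1e-6)
y_hat = krls_predict(X_toy, c_toy, np.array([[0.5]]), degree=2)
```

In this toy case the target is itself a degree-2 polynomial, so a degree-2 kernel with a very small λ recovers it almost exactly (the prediction at 0.5 is close to 0.25). In practice both d and λ would be chosen by cross-validation.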

The model is trained and validated with the database described in the “Landslide data and scenarios” section. The 2018 data were used for validation, while the data from the remaining years were used for calibration. A K-fold cross-validation with K = 10 was used. The choice of the calibration and validation datasets is crucial, and an accurate analysis was performed to select the optimum datasets and to account for the inhomogeneity of registered landslides across the different years (Fig. 6).

The root mean squared error (RMSE) and the mean absolute error (MAE) were used to determine model accuracy, as well as the normalized root mean square error (NRMSE), applied per class of values. A feature importance analysis of the proposed model was carried out through a permutation test.
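For reference, the three error measures can be computed as in the sketch below. The normalization used for the NRMSE is not stated in the text; dividing the RMSE by the mean observed value is an assumption made here, and the class boundaries passed to the hypothetical helper `metrics_per_class` are likewise illustrative.

```python
import math

def error_metrics(observed, predicted):
    """Return (RMSE, MAE, NRMSE) for paired observed/predicted counts.

    NRMSE = RMSE / mean(observed): this normalization is an assumption.
    """
    errors = [p - o for o, p in zip(observed, predicted)]
    n = len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mae = sum(abs(e) for e in errors) / n
    mean_obs = sum(observed) / n
    nrmse = rmse / mean_obs if mean_obs > 0 else float("nan")
    return rmse, mae, nrmse

def metrics_per_class(observed, predicted, class_edges):
    """Apply error_metrics within each bin [lo, hi) of the observed count."""
    results = {}
    for lo, hi in class_edges:
        pairs = [(o, p) for o, p in zip(observed, predicted) if lo <= o < hi]
        if pairs:
            obs, pred = zip(*pairs)
            results[(lo, hi)] = error_metrics(obs, pred)
    return results
```

With this normalization, a class with many observed landslides can show a larger RMSE and yet a smaller NRMSE than a class with few landslides, because the error is measured relative to the size of the events.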

Next, the predicted number of landslides was converted to an alert level according to the described methodology and compared with the alert level corresponding to the landslides registered on the same day. Correct predictions were then defined through the use of a contingency table (Table 2), as described in Stehman (1997). Results were analysed by counting correct predictions (true positives, TP, and true negatives, TN) and incorrect predictions (missed alarms or false negatives, FN, and false alarms or false positives, FP), based on the level of alert: an overestimated predicted alert is considered an FP, while an underestimated predicted alert is considered an FN.

Table 2 Confusion matrix showing the four possible outcomes of a classifier

Several skill scores, or performance indicators, have been proposed in the literature to summarize the confusion matrix. The indicators used to evaluate the performance of the ML method investigated herein are accuracy (ACC), probability of detection (POD), probability of false detection (POFD) and efficiency index (EI).

  • Accuracy (ACC) is the most used indicator to analyse the confusion matrix. It gives an overall evaluation of the number of correct predictions (TN + TP) over the total number of predictions. The value ranges from 0, where no prediction is correct, to 1, where 100% of the predictions are correct.

  • Probability of detection (POD), also called sensitivity, hit rate or true positive rate, represents the fraction of occurred events that were correctly predicted: 0 means that none of the occurred events was predicted, and 1 means that all of them were predicted correctly.

  • Probability of false detection (POFD) is the counterpart of POD, since it measures the fraction of non-events (cases in which no event was observed) for which an event was nevertheless forecast. It ranges between 0 and 1.

  • Efficiency index (EI), also called threat score or critical success index, is the ratio of true positives to the sum of the true positives and the unsuccessful predictions (FN and FP). The indicator evaluates the performance without considering TN, whose count often lies an order of magnitude above the other entries of the confusion matrix.
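The four indicators listed above follow directly from the entries of the confusion matrix. The short sketch below uses their standard definitions; the function name and the counts in the usage example are hypothetical and are not results of this study.

```python
def skill_scores(tp, fp, fn, tn):
    """Compute ACC, POD, POFD and EI from confusion-matrix counts.

    Assumes at least one observed event (tp + fn > 0) and at least one
    observed non-event (fp + tn > 0); otherwise POD or POFD is undefined.
    """
    total = tp + fp + fn + tn
    return {
        "ACC": (tp + tn) / total,      # correct predictions over all predictions
        "POD": tp / (tp + fn),         # observed events that were predicted
        "POFD": fp / (fp + tn),        # observed non-events predicted as events
        "EI": tp / (tp + fn + fp),     # hits over hits, misses and false alarms
    }

# Illustrative counts only
scores = skill_scores(tp=8, fp=2, fn=2, tn=88)
```

With these illustrative counts, ACC is 0.96 while EI is about 0.67: because TN dominates the table, accuracy alone gives an optimistic picture, which is why EI is reported alongside it.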

Results and discussion

Model results

A 3-degree polynomial kernel showed the best cross-validated fit to the measured data. Validation was conducted on the year 2018: the model was run to determine the number of landslides predicted for each AZ, and the results were compared with the registered landslides. In the validation year, 86 days reported at least one landslide event, corresponding to 333 landslides.

The root mean squared error (RMSE = 1.5) and mean absolute error (MAE = 1.2) are deemed satisfactory, although results show a tendency to overestimate the number of landslides. The RMSE and NRMSE applied per class of values are shown in Table 3. We can observe that for higher values of the number of landslides, the RMSE increases while NRMSE decreases.

Table 3 Root mean square error (RMSE) and normalized root mean square error (NRMSE) for different classes

The differences between prediction and observation were computed in terms of alert levels. Since the final aim of the procedure is to support the identification of the alert level adopted in the civil protection procedures for the early warning system, the number of predicted landslides was converted to the corresponding four alert levels: “low”, “ordinary”, “moderate” and “high” (Table 1). Results were then analysed through the use of a contingency table or confusion matrix (Table 4). We note, for instance, that the two “high” criticality levels were correctly predicted.

Table 4 Confusion matrix of results for the different classes of Alert Scenarios and for the whole validation dataset. The confusion matrix compares the predicted alert level (left) with the corresponding observed one (top)

Correct predictions and errors were analysed for each AZ (Table 5). We can notice that the AZs D and E have almost no landslide occurrences observed in the validation period and have a higher number of false alarms.

Table 5 Results of validation through the use of confusion matrix for each Alert Zone (right)

When analysing the performance and results through validation procedures, it is important to consider the uncertainty inherent in the incompleteness of information, especially in remote areas, where landslides may have occurred and gone unreported. This is consistent with the tendency of the model to overestimate the number of landslide events.

As an example, Table 6 compares the predicted and registered numbers of landslides for the same day in the different AZs. The similar input conditions suggest that unreported landslides, especially in the remote areas of AZs D and E, may be a cause of overestimation by the model.

Table 6 Comparison of observed and predicted landslide occurrence for different Alert Zones for the same day (29/10/2018)

Taking into account the confusion matrix (Table 4), the described skill scores were computed to quantitatively define the effectiveness of the model, as reported in Table 7.

Table 7 Skill scores based on the confusion matrix shown in Table 5 are used to perform the validation of the alert level

Feature importance

The relative importance of each variable used in the proposed predictive model is estimated through a permutation analysis: after random re-ordering (shuffling) of the predictive variable, the test statistic is recalculated (Anderson and Robinson 2001). The statistic used is the RMSE, and 50 runs were used to achieve stable feature rankings. Figure 8 shows the percentage increase in the RMSE after shuffling each of the features: the larger the increase in RMSE when a particular feature is shuffled, the more weight that feature has in the model. We can observe that the soil moisture (“sm”) and the 12-h accumulated rainfall (“cum_12H”) are the most important features, but all features contribute to the model prediction. The importance of soil saturation has been highlighted in other studies (Ratto et al. 2013).
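The permutation analysis can be sketched generically as follows. This is an illustration under assumptions rather than the code used for Fig. 8: the already-fitted model is passed in as a hypothetical `predict` callable, the model is not refitted after shuffling, and the importance is reported as the mean percentage increase in RMSE over the repetitions.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error between two 1-D arrays."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

def permutation_rmse_increase(predict, X, y, n_repeats=50, seed=0):
    """Percentage increase in RMSE when each column of X is shuffled.

    predict: callable mapping an (n, p) array to (n,) predictions
    Returns an array of length p; requires a non-zero baseline RMSE.
    """
    rng = np.random.default_rng(seed)
    baseline = rmse(y, predict(X))
    increase = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        scores = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            # Break the link between feature j and the target
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            scores.append(rmse(y, predict(X_perm)))
        increase[j] = 100.0 * (np.mean(scores) - baseline) / baseline
    return increase
```

A feature that the model does not use leaves the RMSE unchanged when shuffled, so its increase is zero; averaging over repeated shuffles reduces the dependence of the ranking on any single random permutation.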

Fig. 8 Feature importance for the proposed predictive model, represented in terms of increase of RMSE (%) obtained by permutation test for each predictive feature (sm = soil moisture; cum_12H = 12-h accumulated rainfall over the AZ; peak_3H = 3-h rainfall peak over an area of 100 km²; DOY = day of the year; AZ = Alert Zone)

Conclusions

In order to model the occurrence of landslides at a regional scale and define a reliable alert system to support the Liguria Civil Protection Department, a machine learning-based algorithm was defined and applied to a database of recorded landslides covering the period 2014–2019. The model is a polynomial kernel regularized least squares regression (KRLS) based on a set of five predictive variables: the Alert Zone, the day of the year, the soil moisture averaged over the AZ, the rain accumulated over 12 h averaged over each AZ and the 3-h rain peak over an area of 100 km². The model provides, for each day and each AZ, the predicted number of landslides, which is then converted to the corresponding Alert Scenario according to D.G.R. n. 1116/2020. The model has shown a tendency to overestimate the number of landslides, and its uncertainties cannot be easily quantified, partly because of the incompleteness of the observed landslide information. Nevertheless, its performance can be considered satisfactory in the context of the Hydrological Assessment procedure operative at the Civil Protection Department. The tested methodology made it possible to validate the use of the selected predictive features for predicting landslide occurrence and to identify the most significant ones. Future development of the model will consist of substituting the observed values of the hydrological variables with forecast values obtained from the predictive model operational at the Regional Agency for Environmental Protection of Liguria for the Civil Protection Department. This will move the model toward operational application, in which it will provide, on a daily basis and separately for each AZ, the predicted level of criticality based on the weather forecasts and observed soil moisture.

The major advantage of the proposed methodology lies in the input data, which were selected so that they can be replaced with values accessible in an operational procedure, supporting the identification of a warning level in a simple and rapid way. The type of output of the model, that is, the predicted number of landslides per AZ, is considered by the authors easier to integrate into an early warning system than a simple probability of exceeding a defined threshold, as it can be more easily interpreted by the operators when integrating the information into a “criticality bulletin”.

The system can be improved by further subdividing the AZs according to lithological areas, obtaining a finer subdivision and a better compromise between operational and scientific constraints.

Finally, it is important to point out that the continuous updating of the database, with the combined information on rainfall and records of landslides that have occurred, makes it possible to improve the calibration of the algorithm parameters.