Implementation of hydrometeorological thresholds for regional landslide warning in Catalonia (NE Spain)

Soil moisture plays a vital role in slope stability. As water infiltrates into the soil, shear strength decreases eventually leading to failure. However, most of the existing regional-scale landslide early warning systems (LEWS) rely solely on rainfall information and use rainfall thresholds to determine if the landslide triggering conditions are met. The original version of the Catalonia region LEWS combines real-time rainfall observations and susceptibility to compute warnings. The LEWS applies a set of rainfall intensity-duration thresholds to determine if the rainfall conditions have the potential to trigger a landslide. This work explores the potential of using modelled soil moisture data in the Catalonia region LEWS. Volumetric water content (VWC) from the LISFLOOD hydrological simulations of the European Flood Awareness System and rainfall estimates have been analysed at the location of recent landslide events. Based on this data, a set of empirical hydrometeorological thresholds combining rainfall and soil moisture information has been obtained for their application into the Catalonia region LEWS. The LEWS has been run for nine months (April–December 2020) using two approaches: (i) combining susceptibility and rainfall intensity-duration (I-D) thresholds and (ii) combining susceptibility and the new hydrometeorological thresholds including soil moisture information. Generally, both LEWS approaches issued moderate or high warnings in the areas where significant rainfall accumulations were recorded. The outputs have been compared at specific locations where landslides were reported during the analysed period. Results show that at the analysed locations false positives are generally reduced when employing the hydrometeorological thresholds in the LEWS. Therefore, this approach is promising and could help improve regional scale LEWS in Catalonia.


Introduction
Rainfall-triggered landslides constitute a significant hazard in mountainous regions, causing major economic losses, physical asset damages, and fatalities (Froude and Petley 2018). Landslide Early Warning Systems (LEWS) are a suitable option to reduce landslide risk by decreasing the exposure and increasing the preparedness of communities that might be affected (Alfieri et al. 2012;Calvello 2017).
The majority of regional-scale LEWS determine if rainfall during an event has the potential of triggering landslides by employing empirical rainfall thresholds that relate landslide occurrence with certain rainfall conditions (NOAA-USGS Debris Flow Task Force 2005; Jakob et al. 2012;Tiranti and Rabuffetti 2010). Rainfall thresholds have been derived by applying heuristic and probabilistic approaches, for different geographical settings and spatial scales (Guzzetti et al. 2008;Caine 1980;Abancó et al. 2016;Brunetti et al. 2010). However, rainfall thresholds usually do not consider the important role that soil moisture plays in slope stability.
As water infiltrates into the soil during a rainfall event, pore water pressure increases, and soil shear strength decreases, eventually leading to failure (Terzaghi 1943;Bogaard and Greco 2016). Therefore, if the initial soil conditions are wet, less rainfall will be required to trigger a landslide. Conversely, more rain will be needed if the initial soil conditions are dry.
For this reason, some authors have tried to indirectly include soil moisture information to rainfall thresholds by incorporating cumulative rainfall amounts preceding the triggering of the landslide event, or by using antecedent precipitation indexes (Frattini et al. 2009;Aleotti 2004;Crozier 1999;Glade et al. 2000;Godt et al. 2009). Still, observed soil moisture conditions do not always correspond well to antecedent precipitation (Brocca et al. 2008;Longobardi et al. 2003). Consequently, the predictive value of rainfall thresholds is often low, and the number of false positives and misses may be high. With the aim of improving the predictive skill of "rainfall-only thresholds", Bogaard and Greco (2018) proposed identifying the conditions leading to landslides by combining rainfall information (trigger) and soil moisture information (cause), the so-called "hydrometeorological thresholds".
Most of the proposed hydrometeorological thresholds have been derived using soil moisture from direct in situ sensor measurements (Mirus et al. 2018;Zhao et al. 2019;Chitu et al. 2017;Oorthuis et al. 2023). The majority of these studies concluded that hydrometeorological thresholds slightly improved the performance of "rainfall-only thresholds" and helped reduce the number of false positives. However, the density of soil moisture networks is usually low, and using instrumentation soil moisture data for regional-scale warning is generally not feasible. The representativeness of the soil moisture measurements significantly decreases with the distance from the monitoring site (Wicki et al. 2020). Additionally, in many regions, soil moisture sensor networks are not available at all. Satellite soil moisture data can also be used (Thomas et al. 2019;Abancó et al. 2021). However, satellite products are generally not available in real-time. They are usually updated with a latency of a few days (Hersbach et al. 2020;Muñoz-Sabater et al. 2021). A feasible alternative consists of using information from lumped or distributed hydrological models (Posner and Georgakakos 2015;Ponziani et al. 2012;Marino et al. 2020;Chitu et al. 2017).
To date, the use of hydrometeorological thresholds in operational regional-scale LEWS has been very scarce. Segoni et al. (2018) tested two different approaches to upgrade the early warning system of the Emilia Romagna region (Italy) by including daily soil moisture data from the TOPKAPI distributed rainfall-runoff model (Ciarapica and Todini 2002). However, the operational version has not yet been implemented.
In Catalonia (NE Spain), the existing LEWS combines information on the terrain susceptibility and the triggering rainfall to depict when and where landslides might occur (Palau et al. 2020). Neither antecedent rainfall nor soil moisture conditions are considered for the computation of warnings. Although the performance of the LEWS is generally good (Palau et al. 2020, 2022), false positives are still an issue. Accounting for soil moisture in the Catalan LEWS is not straightforward as the region's network of soil moisture measurement stations is relatively sparse. The present study is aimed at exploring the potential of using modelled soil moisture data to improve the performance of the Catalan LEWS and provide guidelines for other regions where soil moisture measurements are limited. Achieving this objective requires (i) modifying the current LEWS algorithm to include soil moisture in the warning criteria and (ii) obtaining a set of empirical hydrometeorological thresholds for Catalonia combining rainfall and soil moisture.

Description of the study area
Catalonia is a region of around 32,000 km² located in the NE of Spain. Its climate is varied but can be classified as Mediterranean. Near the coast, the weather is mild and temperate. Inland, the climate is continental with hot summers and cold winters. The Pyrenees present a high-altitude climate, with abundant snow and temperatures below 0 °C during winter. Generally, the rainiest seasons in Catalonia are spring and autumn, except for the Pyrenees, where the rainiest season is summer. Landslides are usually triggered by either convective rainfall events with high intensities or long-lasting rainfalls with moderate intensities (Abancó et al. 2016).

Warning system
The present version of the LEWS over Catalonia (Palau et al. 2020, 2022) uses a rainfall-only approach. It combines two types of information to compute warnings: (i) the terrain susceptibility and (ii) distributed rainfall estimates describing the rainfall situation. The output is a qualitative warning map, updated every time new rainfall information is available (Fig. 1).
The susceptibility map (Fig. 2) identifies the locations where landslides may occur. It was obtained by Palau et al. (2020) by applying a fuzzy logic methodology combining slope angle and land use and land cover information.
Rainfall events are identified by employing an inter-event dry period of 6 h. With the original rainfall-only approach (middle panel in Fig. 1), the intensity-duration-frequency (IDF) curves of the Fabra meteorological observatory in Barcelona (Casas et al. 2004) selected by Palau et al. (2022) are used to define four rainfall hazard levels (Fig. 3a). Then a warning matrix (Fig. 3c) combines susceptibility and the rainfall hazard to obtain a qualitative warning level map (Fig. 1, bottom panel).
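The event separation rule can be illustrated with a minimal sketch. It assumes an hourly rainfall series at a single pixel and treats any hour with more than 0 mm as wet; both choices, as well as the function name, are assumptions made for illustration and are not taken from the operational LEWS.

```python
from typing import List, Tuple


def split_rainfall_events(hourly_rain_mm: List[float],
                          dry_gap_h: int = 6) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs (inclusive) of rainfall events.

    Two wet hours belong to different events when they are separated by
    at least `dry_gap_h` consecutive dry hours (inter-event dry period).
    """
    events: List[Tuple[int, int]] = []
    start = None      # index of the first wet hour of the current event
    last_wet = None   # index of the most recent wet hour
    for i, rain in enumerate(hourly_rain_mm):
        if rain <= 0.0:
            continue
        if start is None:
            start = i
        elif i - last_wet - 1 >= dry_gap_h:
            # The dry spell was long enough: close the previous event.
            events.append((start, last_wet))
            start = i
        last_wet = i
    if start is not None:
        events.append((start, last_wet))
    return events


# Wet hours at indices 1, 2, 5 and 12; only the gap before hour 12 lasts 6 h.
series = [0, 1, 2, 0, 0, 3, 0, 0, 0, 0, 0, 0, 4, 0]
events = split_rainfall_events(series)
```

With this series, the short dry spell between hours 2 and 5 stays inside one event, while the six dry hours before hour 12 start a new one.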
In this paper, the LEWS has been set up to work with a hydrometeorological approach and employ both high-resolution rainfall information and soil moisture data (top and middle panels in Fig. 1). Similarly to the rainfall-only approach, the hydrometeorological approach applies a set of hydrometeorological thresholds (described in the "Hydrometeorological thresholds for Catalonia" section) to characterise the conditions that are prone to triggering a landslide (Fig. 3b). Then, the warning matrix is used in the same way to combine susceptibility and the hydrometeorological hazard and obtain a qualitative warning map.
Fig. 1 Scheme of the overall LEWS framework. The LEWS includes several modules. The top box shows the input data. The middle box presents the warning model, which can either be run using the rainfall-only approach or the hydrometeorological approach. Finally, the bottom box indicates the LEWS outputs
Landslides 20 • (2023)

Rainfall data
Herein, we have used a rainfall data set that blends rain gauge and radar measurements using kriging with an external drift (KED) (Velasco-Forero et al. 2009). Rain gauge data have been obtained from the measurements of 187 tipping-bucket rain gauges from the Meteorological Service of Catalonia (SMC). Radar quantitative precipitation estimates (QPEs) have been obtained from the composite of the observations of the SMC radar network (XRAD).
Radar QPEs have been produced from the volume scans of the C-band single polarisation Doppler radars of the XRAD with the Integrated Tool for Hydrological Forecasting (EHIMI; Corral et al. 2009). This tool includes a chain of quality control, correction, mosaicking, and accumulation algorithms to generate QPE products from raw radar observations. The QPEs have a spatial resolution of 1 km.

Soil moisture data
In Catalonia, soil-moisture readings from monitoring data are freely available only for 17 specific sites, such as those in the Pyrenees and pre-Pyrenees (Hürlimann et al. 2014;Oorthuis et al. 2017;ICGC 2021). Since soil moisture is very heterogeneous and the network of monitoring stations is sparse and does not cover all of Catalonia, using the soil moisture readings from the available monitoring stations for the LEWS running over the entire region is not feasible. Alternatively, satellite or modelled soil moisture products can be used.
The approach implemented here relies on the simulations of soil moisture generated by the LISFLOOD hydrological simulations (Van Der Knijff et al. 2010; Burek et al. 2013) in the European Flood Awareness System (EFAS) (Thielen et al. 2009). Over the EFAS domain, LISFLOOD is set up on a 5 km grid and computes a complete water balance for each grid cell. Soil moisture water balance is calculated at three different soil layers once a day. The output soil moisture from the current day is then used to initialise LISFLOOD hydrologic simulations of the following day. In the top layer, LISFLOOD accounts for infiltration of precipitation, soil evaporation, and plant transpiration. At the bottom soil layer, the model accounts for deep percolation and groundwater storage in the subsoil. The description of soil moisture fluxes between the three soil layers and the subsoil is based on the assumption that the flow of water in the soil is entirely gravity-driven and always in a downward direction (Van Der Knijff et al. 2010). The EFAS LISFLOOD model was calibrated on the period 1990-2017 using an Evolutionary Algorithm developed by Hirpa et al. (2018). The modelled VWC from the top layer has been used in this study.
The VWC time series of the LISFLOOD model has been compared to the time-series of the measured rainfall and VWC at the location of two monitoring sites. Figure 4 shows the comparison at the Rebaixader site. Generally, both the LISFLOOD modelled VWC and the measured VWC are sensitive to the observed rainfall, increase after rainfall events, and are generally high coinciding with the time a debris flow was registered at the monitoring site.
The LISFLOOD VWC over Catalonia has been analysed for the period 2018-2020. The minimum and maximum VWC simulated by LISFLOOD over this period, together with the 25th, 50th, and 75th percentiles, are shown in Fig. 5. The minimum modelled VWC ranges from 0.1 to 0.2 m³/m³ and the maximum from 0.4 to 0.5 m³/m³. The driest soil conditions occur in the south and on the northeastern coast, where the VWC is equal to or lower than 0.2 m³/m³ for 75% of the days in the analysed period (Fig. 5e). The wettest conditions occur in the Pyrenees, where the modelled VWC is over 0.4 m³/m³ during 25% of the days (Fig. 5c-e).
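As a sketch of how such per-cell statistics can be derived, the snippet below summarises a stack of daily VWC grids with NumPy. The array layout (days, rows, columns) and the function name are assumptions made for illustration; the values in the example are synthetic and do not come from LISFLOOD.

```python
import numpy as np


def vwc_summary(vwc: np.ndarray) -> dict:
    """Per-cell minimum, 25th/50th/75th percentiles and maximum.

    `vwc` is assumed to have shape (days, rows, cols), in m3/m3.
    """
    p25, p50, p75 = np.percentile(vwc, [25, 50, 75], axis=0)
    return {
        "min": vwc.min(axis=0),
        "p25": p25,
        "p50": p50,
        "p75": p75,
        "max": vwc.max(axis=0),
    }


# Synthetic example: five days on a single grid cell.
daily_vwc = np.array([0.1, 0.2, 0.3, 0.4, 0.5]).reshape(5, 1, 1)
stats = vwc_summary(daily_vwc)
```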

Landslide inventories
Landslide inventories are a vital element to correctly characterise the rainfall and soil moisture conditions that have led to landslides in the past. Having complete landslide inventories that include exact temporal and spatial information is often challenging (Kirschbaum et al. 2010;Peres et al. 2018). In this study, we have combined landslide inventories for different periods between 2018 and 2020 (Table 1) to obtain the most comprehensive landslide database possible in the study area.
To avoid over-fitting, two independent datasets have been used: the calibration inventory and the verification inventory (Table 1). The calibration inventory has been employed to obtain the hydrometeorological thresholds described in the "Hydrometeorological thresholds for Catalonia" section. It includes 603 entries corresponding to shallow slides and debris flows gathered from different sources: (i) the #Esllavicat citizen science initiative (Palau 2021), (ii) Catalan Civil Protection (CECAT), (iii) the road maintenance and management authorities (RMA), and (iv) the Gloria storm inventory collected by the Cartographical and Geological Institute of Catalonia (ICGC) (González et al. 2020). The calibration inventory contains landslides from the periods April-December of the years 2018 and 2020, as well as landslides triggered by two significant rainfall events that affected Catalonia in October 2019, and from 20-23 January 2020 (Palau et al. 2022). It is important to stress that the landslides from this last period constitute around 64% of the calibration inventory and were generally triggered by the more than 400 mm that fell from 20 to 23 January 2020 (Palau et al. 2022).

Original Paper
The verification inventory has been used to analyse the LEWS performance. It has been constructed from an additional landslide database that the Cartographical and Geological Institute of Catalonia (ICGC) provided. The verification inventory consists of 50 entries corresponding to rainfall-triggered slides or flows registered from April to December 2020. 70% of the landslides in the verification inventory were triggered during a rainfall event that affected Catalonia from 18 to 22 April 2020. Most of the events included in the verification inventory were reported by non-technical population. Thus, the triggering mechanism of some of the entries is uncertain. A few landslides can be due to river erosion at the toe of the slopes, and a few entries might be related to sediment transport and deposition processes.
It is essential to state that the available landslide inventories are biased towards areas with a high population density or close [...]. More specifically, 66% of the landslides in the verification inventory were reported on engineered slopes and road embankments. In contrast, only a few landslides were reported in areas with a low population density and relatively frequent landslides (e.g., Pyrenees or pre-Pyrenees). In addition, the time and location of some of the included landslide events are uncertain.

Assessment of the hydrometeorological conditions leading to landslides
This section is aimed at finding a set of hydrometeorological I-D-VWC thresholds that can subsequently be implemented in the LEWS to improve its performance. To do so, we have modified the currently used I-D thresholds to account for soil moisture. The hydrometeorological conditions at the location of the 603 landslides in the calibration inventory have been analysed for the period of the calibration inventory (i.e., 585 days between April 2018 and December 2020; see the first row of Table 1).
Since the precise timing of the occurred landslides is mostly unknown, each landslide entry of the calibration inventory has been related to the maximum intensity and corresponding volumetric water content recorded during the 24 h of the landslide date. Similarly, the time when the maximum rainfall intensity was recorded and corresponding volumetric water content has been found for the days during which landslides were not reported.
When considering the original rainfall-only approach, the rainfall conditions at the location of landslides can be represented as points in a 2-D space defined by rainfall intensity and rainfall duration. In this space, the I-D thresholds are curves that have the shape of a power law. This can be written as
$$I = \alpha D^{-\beta} \qquad (1)$$
where I is the maximum floating rainfall intensity for a rainfall duration D in hours, and α and β are known constants. α represents the value of the rainfall intensity when the duration is equal to 1 h and β represents the rate of growth or decline of the power law. For the thresholds used in the Catalonia LEWS, α and β were set by Palau et al. (2022). β is equal to 0.78 for all rainfall return periods as given by the Fabra meteorological observatory in Barcelona IDF curves (Casas et al. 2004).
Here, we aim to modify the warning system to consider rainfall and soil moisture when running with the hydrometeorological approach. Thus, VWC has been included as an additional variable in the formerly used rainfall intensity-duration space. In the 3-D space defined by rainfall intensity (I), rainfall duration (D) and volumetric water content (VWC), the I-D thresholds can be represented as surfaces and the hydrometeorological conditions at the location of landslides as points.
To simplify the representation of the hydrometeorological data, we can benefit from the fact that the rainfall thresholds have the same growth rate and define an equivalent intensity (I_eq) for an event duration of 0.5 h as follows:
$$I_{eq} = I \left(\frac{D}{0.5}\right)^{\beta} \qquad (2)$$
Applying this concept, I_eq can be used as a substitute for I-D to evaluate the magnitude of the rainfall event. In this way, we can simplify the representation of the rainfall conditions. However, I_eq should not be used to describe the rainfall episode itself. In the general case, we might still be interested in working with I-D.
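The equivalent intensity can be sketched in a few lines of code. The sketch assumes that an observed (I, D) pair is projected along a power law with exponent β = 0.78 onto the reference duration of 0.5 h, i.e. I_eq = I (D / 0.5)^β; this form follows from thresholds that all decline at the same rate, but the function and constant names are illustrative only.

```python
BETA = 0.78      # common exponent of the I-D thresholds (from the IDF curves)
D_REF_H = 0.5    # reference duration of the equivalent intensity, in hours


def equivalent_intensity(intensity_mm_h: float, duration_h: float,
                         beta: float = BETA,
                         d_ref_h: float = D_REF_H) -> float:
    """Project an (I, D) pair along a power law I ~ D**(-beta) to d_ref_h."""
    if duration_h <= 0:
        raise ValueError("duration_h must be positive")
    return intensity_mm_h * (duration_h / d_ref_h) ** beta


# Two points on the same hypothetical curve I = 30 * D**(-0.78) ...
i_1h = 30.0 * 1.0 ** (-BETA)
i_6h = 30.0 * 6.0 ** (-BETA)
# ... collapse onto one equivalent intensity.
ieq_1h = equivalent_intensity(i_1h, 1.0)
ieq_6h = equivalent_intensity(i_6h, 6.0)
```

Under this assumption, every point lying on the same power-law curve maps to a single I_eq value, which is why each I-D threshold becomes a constant in the I_eq-VWC space.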
Using I_eq, we can then portray the rainfall-only I-D thresholds as constant functions in the I_eq-VWC 2-D space (dotted lines in Fig. 6). In this space, the hydrometeorological conditions during days with landslide events and days without landslide events can be represented as points (diamonds in Fig. 6).
At most of the studied locations, we have analysed 584 no-event days and only one triggering-event day. Thus, the number of points in Fig. 6 representing days without landslides is significantly higher than the number of points representing the hydrometeorological conditions when landslides were registered.
The visual inspection of Fig. 6 reveals that most landslides were triggered by rainfall events that registered intensities above the "Very Low-Low" I-D threshold. This suggests that the rainfall I-D thresholds already allow a relatively good classification between rainfall events that triggered landslides and rainfall events that did not, in agreement with the results of Palau et al. (2020, 2022). Fewer landslides have been triggered by rainfall events that registered equivalent intensities above the "Moderate-High" I-D threshold (100 mm/h). Rainfall events with such high intensities are less frequent in Catalonia. Therefore, landslides triggered by such rainfall conditions are probably not well represented in the calibration inventory.
Only a few landslides have been registered for VWC between 0.15 and 0.25 m³/m³ and I_eq below 40 mm/h. However, due to the large number of points represented in Fig. 6, determining an obvious I_eq-VWC threshold is not straightforward. To get a better overview of the hydrometeorological conditions that triggered landslides and the conditions that did not, we have plotted the number of landslide events and no-events for each 10 mm/h and 0.05 m³/m³ interval (Fig. 7).
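The binning behind such a plot can be sketched with a 2-D histogram. The axis limits (100 mm/h and 0.60 m³/m³), the function name, and the three sample points are assumptions made for illustration and are not data from the calibration inventory.

```python
import numpy as np


def count_by_bin(i_eq, vwc, i_step=10.0, vwc_step=0.05,
                 i_max=100.0, vwc_max=0.60):
    """Count points per (I_eq, VWC) cell of size i_step x vwc_step."""
    # Build the edges from integer multiples to avoid np.arange float drift.
    i_edges = i_step * np.arange(int(round(i_max / i_step)) + 1)
    vwc_edges = vwc_step * np.arange(int(round(vwc_max / vwc_step)) + 1)
    counts, _, _ = np.histogram2d(i_eq, vwc, bins=[i_edges, vwc_edges])
    return counts.astype(int), i_edges, vwc_edges


# Illustrative points only: (I_eq in mm/h, VWC in m3/m3).
sample_i_eq = [45.0, 55.0, 12.0]
sample_vwc = [0.42, 0.43, 0.22]
counts, i_edges, vwc_edges = count_by_bin(sample_i_eq, sample_vwc)
```

Running the same function separately on landslide days and on no-event days gives the two count grids that can then be compared cell by cell.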
Our analysis shows that generally, the VWC when landslide events were triggered ranged between 0.30 m³/m³ and 0.45 m³/m³ (Fig. 7a). The largest number of recorded landslides was observed when the VWC was between 0.40 and 0.45 m³/m³ and the I_eq between 40 mm/h and 60 mm/h. Fewer landslides were registered with the same I_eq and lower VWC. Only a few landslide events were triggered with VWC over 0.45 m³/m³. Such VWC values are not frequent near the coast, where most landslides in the calibration inventory are located (Fig. 5). Thus, landslides triggered with such VWC are less represented in the calibration inventory.
There is a large variability in rainfall intensity and soil moisture conditions on days landslides were not registered. The number of no-events decreases with increasing rainfall intensity (Fig. 7b). The observed I_eq during no-event days ranges between 0 mm/h and 100 mm/h and the VWC ranges from 0.15 m³/m³ to 0.55 m³/m³. This fact can be partly explained by the limitations of the calibration inventory. It can also be partly explained by the fact that the number of analysed no-events is much larger than the number of analysed events.
Although we have not observed a strong correlation between I_eq and VWC of landslide events, Figs. 6 and 7 indicate that higher intensities were generally required to trigger landslides when the VWC was low (dry soil) and lower intensities when the VWC was high (wet soil). Thus, there is a tendency towards lower thresholds for higher VWC and vice versa.

Proposed hydrometeorological thresholds
To obtain the hydrometeorological thresholds, we have modified the currently used I-D thresholds to demand higher intensities to trigger a landslide when the VWC is low and lower intensities when the VWC is high. To do so, we have used a trial-and-error approach and employed ROC analysis (not shown here) to try to determine the thresholds' ability to separate between landslide events and no events. However, the hydrometeorological thresholds do not account for the terrain susceptibility, and rainfall thresholds already provide a relatively good classification of rainfall events that can trigger landslides and rainfall events that cannot. Identifying a significant improvement in the skill of the thresholds when accounting for VWC has been a challenge. The results of our analysis were not very conclusive and did not provide much more than simple guides and general patterns. We have used these guides as a benchmark to define the three preliminary I_eq-VWC thresholds that allow us to separate between four hydrometeorological hazard classes (very low, low, moderate, and high).
If the LEWS were solely based on hydrometeorological thresholds, our interest would lie in finding a classifier that perfectly separates the I_eq-VWC conditions that can lead to landslides and those that cannot. However, the Catalonia region LEWS combines information on the hydrometeorological conditions and terrain susceptibility to compute the warnings.
Finally, the three I_eq-VWC thresholds presented in Fig. 3 have been proposed after multiple trials considering the expected warnings when combining the hydrometeorological hazard and susceptibility. The first threshold separates the very low and low hydrometeorological hazards. It is key to discriminating between hydrometeorological conditions that can trigger a landslide in areas with high terrain susceptibility (Fig. 3c). The second threshold separates the low and moderate hydrometeorological hazards and determines when moderate or high warnings are computed in areas where the susceptibility is moderate. Finally, the third threshold separates the moderate and high hydrometeorological hazards and is key to computing warnings in low susceptibility terrain (Fig. 3c). The proposed I_eq-VWC thresholds remain somewhat arbitrary, but they reproduce the expected relation between rainfall intensity and soil moisture and are similar to the I-D thresholds for VWC values relatively close to the average VWC observed in Catalonia.
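The way three VWC-dependent thresholds split the I_eq-VWC space into four hazard classes can be sketched as follows. All numbers below (base thresholds, reference VWC, and slope) are placeholders chosen for illustration; they are not the thresholds of Fig. 3, and the linear dependence on VWC is also an assumption.

```python
from bisect import bisect_right

HAZARD_CLASSES = ("very low", "low", "moderate", "high")

# Placeholder values, NOT the published thresholds.
BASE_THRESHOLDS_MM_H = (40.0, 70.0, 100.0)  # thresholds at the reference VWC
VWC_REF = 0.30                               # hypothetical reference VWC
SLOPE = 2.0                                  # relative change per unit of VWC


def thresholds_at(vwc: float) -> list:
    """Hypothetical thresholds: higher for dry soil, lower for wet soil."""
    factor = max(0.1, 1.0 + SLOPE * (VWC_REF - vwc))
    return [t * factor for t in BASE_THRESHOLDS_MM_H]


def hydromet_hazard(i_eq_mm_h: float, vwc: float) -> str:
    """Hazard class = number of thresholds reached by I_eq at this VWC."""
    return HAZARD_CLASSES[bisect_right(thresholds_at(vwc), i_eq_mm_h)]


# The same rainfall (I_eq = 50 mm/h) under dry, average and wet soil.
dry = hydromet_hazard(50.0, 0.15)
average = hydromet_hazard(50.0, 0.30)
wet = hydromet_hazard(50.0, 0.45)
```

In this sketch the same equivalent intensity is rated lower on dry soil and higher on wet soil, which is the qualitative behaviour the proposed thresholds are meant to reproduce.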

Performance demonstration of the LEWS during 2020
In this section, we explore if the empirical hydrometeorological I_eq-VWC thresholds proposed in the "Hydrometeorological thresholds for Catalonia" section can be used to improve the Catalonia LEWS performance. With this aim, the I_eq-VWC thresholds have been applied to run the LEWS for a nine-month period from April to December 2020. Then, the LEWS outputs using the new I_eq-VWC thresholds have been compared with the LEWS outputs when running the LEWS with the original rainfall-only I-D thresholds.
For the analysis of the LEWS performance, the verification inventory described in the "Landslide inventories" section has been used. Since the verification inventory is fairly incomplete, a quantitative evaluation of the LEWS performance over the whole domain was not possible. Therefore, in the following three subsections, the performance of the LEWS has been (i) qualitatively analysed in terms of the number of days during which at least one moderate or high warning has been computed and (ii) quantitatively analysed in terms of its ability to identify the occurrence of the landslides of the verification inventory. Additionally, the warning level time series have been analysed in detail at the location of the landslides of the verification inventory and at the Rebaixader and Portainé debris flow monitoring sites, where several debris flows included in the calibration inventory were recorded during the verification period.
For simplicity, herein the terms first threshold, second threshold and third threshold are used to refer to the rainfall and hydrometeorological thresholds separating the very low and low, low and moderate, and moderate and high hazards, respectively.

Qualitative evaluation of the LEWS outputs
For the qualitative evaluation of the LEWS outputs, we have used a sub-division of Catalonia into hydrological subbasins (Palau et al. 2020). The mean area and standard deviation of the subbasins are 2.1 km² and 1.6 km², respectively. The LEWS outputs have been compared qualitatively in terms of the number of days during which at least a moderate or high warning was given within each of the subbasins using the rainfall-only I-D thresholds and the new hydrometeorological I_eq-VWC thresholds.
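Counting warning days per subbasin reduces to a simple aggregation. The sketch below assumes that the highest warning level reached each day in each subbasin is already available as an integer array (0 = very low, 1 = low, 2 = moderate, 3 = high); this encoding and the function name are illustrative assumptions.

```python
import numpy as np


def warning_days_per_subbasin(daily_max_level: np.ndarray,
                              min_level: int = 2) -> np.ndarray:
    """Number of days with at least a moderate warning in each subbasin.

    `daily_max_level` is assumed to have shape (days, subbasins).
    """
    return (daily_max_level >= min_level).sum(axis=0)


# Three days, two subbasins (synthetic levels).
levels = np.array([[0, 2],
                   [3, 1],
                   [2, 2]])
days_with_warning = warning_days_per_subbasin(levels)
```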
Results show that most of the warnings have been computed in high susceptibility areas, mainly located at the North, coinciding with the Pyrenees, and parallel to the coastline, coinciding with the Catalan Coastal Ranges (Figs. 2 and 8). Applying the I-D thresholds, warnings have been computed for up to 34 days in two subbasins at the Pyrenees and more than 15 days in many others (Fig. 8a). Since landslides are rare events, such a large number of landslides in nine months are seldom observed within one subbasin. It could be thus argued that some of the warnings could be false alarms.
In contrast, when applying the I_eq-VWC thresholds, the number of days with warnings is significantly lower, especially in the Pyrenees area where susceptibility is higher (Fig. 2). Generally, fewer than 7 days with warnings have been issued with the hydrometeorological approach. Furthermore, only 13 days with warnings have been computed at the two subbasins where warnings were given on 34 days when applying the I-D thresholds (Fig. 8b).
To understand the reason behind the significant difference in the number of days with warnings when using the two types of thresholds, the monthly rainfall and the mean monthly VWC from April to December 2020 have been analysed (see Fig. 9 top row). During June, July, and August 2020, the Pyrenees were affected by several high-intensity rainfall events that resulted in large rainfall accumulations. During this period, the mean VWC in the areas that were most affected by these rainfalls remained around 0.3 m³/m³ (Fig. 9 bottom row).
As expected, warnings were issued less often when using the hydrometeorological approach (Fig. 10). Since the terrain susceptibility in the Pyrenees is generally high, the first threshold must be exceeded to issue a "Moderate" warning. With the VWC modelled during June, July, and August 2020 in the Pyrenees, exceeding the first I_eq-VWC threshold requires higher rainfall intensities than exceeding the first I-D threshold (Fig. 7). During these three months, the rainfall conditions exceeded the first I-D threshold multiple times. However, the recorded intensities did not exceed the first I_eq-VWC threshold as frequently.
In conclusion, our results show a significant decrease in the number of days with a warning when applying the I_eq-VWC thresholds, especially in areas where the terrain susceptibility is generally high. In our case, this could be relevant because the observed mean VWC conditions in the Pyrenees area during June, July, and August 2020 correspond to the 50th percentile (see Fig. 5) and therefore are relatively frequent. Consequently, using the I_eq-VWC thresholds could help reduce the number of warnings issued in the Pyrenees area, some of which could be false alarms.

Performance at specific locations
Herein the LEWS performance has been quantitatively analysed at specific locations where landslides have been reported. In LEWS performance assessments, the landslide observations in the inventories are assumed to be complete and accurate. However, the limitations of the verification inventory outlined in the "Landslide inventories" section could impact the results of the verification. To address this issue, we have first focused our efforts on understanding the LEWS performance at the locations of nine high-quality landslides that were triggered in natural slopes, for which the process type is well documented (indicated with initials in Fig. 2). By doing so, we have been able to reduce the uncertainties introduced by the landslide inventory in the verification. Then, the LEWS performance at the location of all the 50 landslides in the verification inventory has been analysed to get a more general overview of the functioning of the LEWS (Fig. 2).
For the quantitative analysis of the LEWS performance at specific landslide locations, we have considered that a warning has been issued when the computed warning level was either moderate or high. To compute a warning at the location of the landslides in moderate-susceptibility areas, the rainfall and VWC conditions must overcome the second threshold and to compute a warning in high-susceptibility terrain the rainfall and VWC must overcome the first threshold (Fig. 3). The landslides from the verification inventory have been used as reference.
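The combination of susceptibility and hazard can be sketched as a lookup table. The matrix below is only an illustrative reconstruction that is consistent with the rules stated in the text (first threshold in high susceptibility, second in moderate, third in low); the actual warning matrix is the one in Fig. 3c, and the three susceptibility classes are an assumption.

```python
HAZARD = ("very low", "low", "moderate", "high")

# Illustrative matrix, NOT a copy of Fig. 3c.
# Rows: susceptibility class; columns: hazard class in the order of HAZARD.
WARNING_MATRIX = {
    "low":      ("very low", "very low", "low",      "moderate"),
    "moderate": ("very low", "low",      "moderate", "high"),
    "high":     ("very low", "moderate", "high",     "high"),
}


def warning_level(susceptibility: str, hazard: str) -> str:
    """Look up the qualitative warning level for one grid cell."""
    return WARNING_MATRIX[susceptibility][HAZARD.index(hazard)]


def is_warning(level: str) -> bool:
    """A warning is counted when the level is moderate or high."""
    return level in ("moderate", "high")


# Exceeding only the first threshold (hazard "low") in different terrain.
high_terrain = warning_level("high", "low")
moderate_terrain = warning_level("moderate", "low")
```

With this table, exceeding the first threshold is enough for a warning in high-susceptibility terrain but not in moderate-susceptibility terrain, where the second threshold must be exceeded.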
The dates of reported landslides have been compared with the days during which at least one warning has been given. From the comparison of the observed landslides and the LEWS outputs, the number of true positives (TP), false negatives (FN), and false positives (FP) has been counted. TP are defined as the number of days during which at least one warning was computed coinciding with the date and location of a landslide report. FN are defined as the number of days on which a landslide was reported but no warning was given at its location. Finally, FP are days during which the LEWS issued a warning at a site where a landslide was not reported on that date.
To illustrate the performance of the LEWS at specific locations we have chosen two different metrics: (i) the true positive rate (TPR) and (ii) the false alarm ratio (FAR):
$$\mathrm{TPR} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}}, \qquad \mathrm{FAR} = \frac{\mathrm{FP}}{\mathrm{TP} + \mathrm{FP}}$$
The TPR measures the fraction of the observed landslides for which warnings were computed. The FAR measures the fraction of warnings for which landslides were not reported. The TPR and FAR values range from 0 to 1. An ideal LEWS should have no FN and no FP. Thus, the perfect TPR and FAR scores should be 1 and 0, respectively.
Table 2 shows a summary of the skill scores obtained when analysing the LEWS performance only at the location of the nine high-quality points of the verification inventory. Both configurations of the LEWS (using I-D and I_eq-VWC thresholds) have the same number of TP and FN at these locations. In both cases, the TPR is 0.77, indicating that the number of FN is relatively low when compared to the number of FP. Our results show that landslides in these nine locations are generally associated with moderate warning levels. The main difference between the LEWS outputs with the two configurations is the number of FP (Table 1). With both configurations, the FAR values are relatively large. However, a higher number of FP has been computed when running the LEWS with the I-D thresholds than when employing the I_eq-VWC thresholds, resulting in an 18% reduction in the FAR when using the I_eq-VWC thresholds. The decrease in the number of FP is generally associated with moderate-intensity rainfall events that affected a high-susceptibility area when the soil was relatively dry.
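The counting rules and the two skill scores defined above can be sketched for a single site as follows. This is an illustrative sketch only: the function names are hypothetical, each day is represented by a `datetime.date`, and the dates in the usage note are invented placeholders rather than values from the verification inventory.

```python
from datetime import date


def contingency(warning_days: set, landslide_days: set) -> tuple:
    """Count (TP, FN, FP) at one site from sets of daily dates."""
    tp = len(warning_days & landslide_days)  # warning and landslide on the same day
    fn = len(landslide_days - warning_days)  # landslide reported, no warning
    fp = len(warning_days - landslide_days)  # warning issued, no landslide reported
    return tp, fn, fp


def tpr(tp: int, fn: int) -> float:
    """True positive rate: fraction of observed landslides with a warning."""
    return tp / (tp + fn) if (tp + fn) else float("nan")


def far(tp: int, fp: int) -> float:
    """False alarm ratio: fraction of warnings without a reported landslide."""
    return fp / (tp + fp) if (tp + fp) else float("nan")
```

With three warning days, only one of which coincides with a reported landslide, for instance `contingency({date(2020, 4, 21), date(2020, 9, 10), date(2020, 12, 5)}, {date(2020, 4, 21)})`, the counts are one TP, no FN, and two FP, so the TPR is 1 and the FAR is 2/3.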
Subsequently, the LEWS performance at the location of all 50 landslides of the verification inventory has been analysed. Table 3 summarises the verification results. Generally, the same number of TP and FN is obtained when running the LEWS with the two types of thresholds, but a higher number of FP has been computed when applying the I-D thresholds than when employing the I_eq-VWC thresholds.
At the sites with high susceptibility, both configurations of the LEWS (using I-D and I_eq-VWC thresholds) have the same number of TP and FN. High TPR and FAR values are achieved with both LEWS configurations. In both cases, the TPR is 0.71, indicating that the number of FN is relatively low when compared to the number of FP. The FAR slightly decreases when applying the I_eq-VWC thresholds.
However, a considerable number of FN are computed at the location of landslides in moderate and low susceptibility terrain with both LEWS configurations resulting in low TPR. These FN partly correspond to "Low" warnings during low-intensity rainfall events. Some FN could be due to sediment deposition on roads or landslides triggered during rainfall events by different processes, such as river erosion at the toes of the slopes. Finally, false negatives in moderate and low-susceptibility terrain could also be partly attributed to landslides triggered in engineered slopes (such as road cuts), which are not well represented in the used susceptibility map.
Finally, it is worth noting that the results from the verification using the entire inventory at the sites with a high susceptibility are similar to those obtained from the LEWS verification at the location of the nine "high-quality" landslides. This outcome is expected as seven of the nine high-quality landslides were triggered in high-susceptibility terrain.

Analysis of the warning level time series at selected locations
To investigate the reason for the differences in the number of FP computed with the LEWS when using the two types of thresholds, we conducted a detailed analysis of warning level time series from April to December 2020 with two different datasets: (i) the nine high-quality landslides within the verification inventory and (ii) the recorded debris flows at the Rebaixader and Portainé debris flow monitoring sites (Hürlimann et al. 2014; Palau et al. 2017).
Regarding the results obtained at the locations of nine high-quality landslides within the verification inventory, we selected site A and site I (see Fig. 2 for location) to show an example of the LEWS outputs obtained when applying the two types of thresholds (Fig. 10). At these two sites, the terrain susceptibility is "High".

From 20 to 21 April 2020, site A was affected by an intense rainfall event (blue line in Fig. 11a). A significant landslide was triggered on 21 April around 20:00, impacting a road which remained closed for almost 24 h. At that time, the recorded rainfall intensity was above the first I-D threshold. Thus, with the rainfall-only approach, the LEWS has been able to compute moderate warnings. Two additional warnings have been given with the rainfall-only approach, one in September 2020 and one in December 2020. However, only the April 2020 landslide was reported. These two extra warnings are therefore considered false positives.
With the I_eq-VWC thresholds, the LEWS has also been able to compute a moderate warning at the time the April 2020 landslide was reported. In contrast to the rainfall-only approach, only one FP has been given (Fig. 11a). The LEWS incorrectly issued a moderate warning during the December 2020 rainfall event when the VWC was above 0.4 m³/m³ (orange line in Fig. 11a). With this VWC, the first I-D and I_eq-VWC thresholds are very similar (Fig. 7) and require relatively low intensities to be overcome. The LEWS has correctly given no warning during the rainfall event in September 2020. This can be explained by the VWC (orange line in Fig. 11a), which was around 0.23 m³/m³ during the September 2020 rainfall event. With such a low VWC, overcoming the first I_eq-VWC threshold requires substantially higher intensities than overcoming the first I-D threshold (Fig. 7). Such intensities were not recorded during the September 2020 rainfall event. Consequently, moderate warnings have not been computed during the September 2020 rainfall event with the hydrometeorological approach.
At site I, the LEWS has been able to compute a moderate warning coinciding with a moderate-intensity rainfall event (blue line in Fig. 11b) that triggered a landslide in April 2020 using the two types of thresholds. Three additional warnings have been incorrectly issued with both LEWS configurations, one in June, one in September, and one in October 2020 (Fig. 11b). In September 2020 and October 2020, "Moderate" warnings have been computed with the two types of thresholds. However, in June 2020, a "Moderate" warning has been given with the rainfall-only I-D thresholds and a "High" warning with the hydrometeorological thresholds. The intensity of the June 2020 rainfall event was around 80 mm/h and the VWC was around 0.35 m³/m³. For these hydrometeorological conditions, the second I_eq-VWC threshold requires slightly lower intensities to be reached than the second I-D threshold. Therefore, the given warning level with the I_eq-VWC thresholds was in this case higher.
Fig. 11 Rainfall equivalent intensity (I_eq) (blue line) and volumetric water content (VWC) (orange line) time series at (a) site A and (b) site I (see Fig. 2 for location). The colour bars below the plots represent the daily warning level time series obtained from running the LEWS using the rainfall-only thresholds (I-D) and the I_eq-VWC hydrometeorological thresholds (HM). The red-dashed vertical line represents the time when the landslide was observed according to the verification inventory.
At the Rebaixader and Portainé debris flow monitoring sites, the terrain susceptibility is also high. At the Rebaixader, two large debris flows were registered on 6 June 2020 and 12 July 2020, with estimated volumes of 10,000 m³ and 9600 m³. Additionally, two debris floods that mobilised smaller volumes were recorded on 14 July 2020 and 28 August 2020. At Portainé, two debris flows were recorded, on 12 August 2020 and 28 August 2020, and one debris flood on 10 July 2020.
With the two types of thresholds, the LEWS computed moderate warnings coinciding with the time of the two debris flows and the July debris flood recorded at the Rebaixader (Fig. 12a) and the 28 August 2020 debris flow at Portainé (Fig. 12b). The 28 August 2020 debris flood at Rebaixader and the 10 July 2020 debris flood and the 12 August 2020 debris flow at Portainé were triggered by low-intensity rainfall events. Neither of the two approaches was able to detect any of these three events. Finally, one moderate FP was computed during the month of July at the Rebaixader catchment when applying the I-D thresholds. No FP has been given with the I_eq-VWC thresholds at the two monitoring sites.

Conclusions
The presented study introduces a new method for the Catalonia region LEWS that replaces rainfall-only thresholds with empirical hydrometeorological thresholds accounting for both rainfall and soil moisture.
The employed rainfall input of the LEWS consists of KED rainfall estimates that combine rain gauge observations and radar measurements. Soil moisture information consists of modelled VWC from the EFAS runs of the LISFLOOD model. With the hydrometeorological thresholds in mind, we have gathered data from different sources to compile two inventories that are as exhaustive as possible. The resulting calibration inventory contains 603 entries and the verification inventory includes 50 landslides. Both inventories mainly contain information on landslides that have caused impacts during severe rainfall events. Small landslides and reactivations that have not caused damage are usually unreported.
Nearly two years of rainfall and soil moisture information at the location of the landslides of the calibration inventory have been analysed. As expected, our results suggest that generally higher I_eq values were required to trigger landslides when the VWC was low and lower I_eq values were needed when the VWC was high. However, we have not identified a clear correlation between the I_eq and VWC of landslide events. Consequently, adjusting the hydrometeorological I_eq-VWC thresholds has been challenging, partly because of the limitations of the calibration inventory and partly because the rainfall I-D thresholds might already provide a relatively good classification between the rainfall conditions that can trigger a landslide and those that cannot.
Finally, the hydrometeorological thresholds have been obtained manually using the results of ROC analysis as guidance. The three proposed I_eq-VWC thresholds have been designed to be applied in the Catalonia LEWS using the chosen rainfall and VWC products as inputs and in combination with susceptibility information. Each of the three I_eq-VWC thresholds has been determined to distinguish the equivalent intensity and VWC conditions that can lead to landslides in areas with a specific susceptibility. Using alternative VWC products such as VWC measurements at monitoring stations could result in overestimating or underestimating the hydrometeorological hazard. In that case, the thresholds must be checked and adapted.
The results of running the LEWS over a nine-month period with the rainfall-only as well as the hydrometeorological configuration show that moderate and high warnings are computed more frequently with the rainfall-only (I-D) thresholds than with the hydrometeorological (I_eq-VWC) thresholds. This difference is especially significant for high-susceptibility areas affected by moderate-intensity rainfall events when the VWC is low. The LEWS performance has been analysed quantitatively only at the locations where the 50 landslides in the verification inventory were reported. A significant number of FN have been computed at these landslide sites in moderate and low susceptibility terrain using the two types of thresholds. FN can be partly due to uncertainties in the recorded process type and partly due to landslides triggered in engineered slopes, which are not well represented in the susceptibility map. However, our results indicate that at the location of landslides in moderate and high-susceptibility terrain modelled VWC can help improve the performance of regional-scale LEWS by reducing the number of FP. This agrees with the findings of Segoni et al. (2018), who employed estimated mean soil moisture from a rainfall-runoff model, and Pecoraro and Calvello (2021), who used soil moisture information from sensor measurements in prototype versions of regional-scale LEWS.
In Catalonia, the network of soil moisture sensors is sparse and cannot represent the soil moisture conditions well over the whole region. Thus, employing daily modelled VWC is a suitable alternative to cover the entire LEWS domain. From an operational point of view, an advantage of the EFAS VWC simulations over satellite products is that EFAS simulations are available in real-time on a pan-European scale. In contrast, many satellite products are updated with delay (Beck et al. 2021). Therefore, the EFAS VWC simulations could also be applied to improve the performance of real-time LEWS in other regions of Europe where the soil moisture sensor network is sparse.
It must be remembered that the uncertainties in the LISFLOOD model may be transferred to the LEWS. In this regard, Rosi et al. (2021) suggested relying on rainfall-only information to describe the intensity of the triggering rainfall event and the antecedent rainfall conditions. However, our results support the outcomes of other studies (e.g., Wicki et al. 2021) and show that including modelled VWC information improves the representation of hydrological processes and leads to a better representation of landslide-triggering conditions. Moreover, the proposed hydrometeorological thresholds use VWC information mainly to distinguish between dry and wet soil conditions. Hence, systematic errors or biases in VWC quantification are not very critical.
The results of our study suggest that modelled VWC can be used to improve the performance of regional-scale LEWS in areas with a sparse network of soil moisture measurement stations. In this regard, our outcomes are encouraging. However, a quantitative evaluation of the LEWS performance during long periods, for which an exhaustive inventory is available, is required to confirm our results and improve the preliminary I_eq-VWC thresholds obtained in this study. It would also be helpful to corroborate whether the hydrometeorological thresholds can improve the LEWS performance in moderate and low susceptibility terrain.
Future developments in the Catalonia region LEWS should focus on reducing the number of FN. This could, for example, be accomplished by improving the susceptibility map to capture engineered slopes along linear infrastructures. Finally, landslide inventories play a key role in the derivation of the susceptibility map, the calibration of rainfall and hydrometeorological thresholds, and the verification of the LEWS. For this reason, we recommend continuing the efforts to collect more comprehensive landslide inventories containing accurate space and time information, recurring landslide events, and detailed descriptions of the triggering mechanism and volume.