Preparing first-time slope failures hazard maps: from pixel-based to slope unit-based

In this work, we present a novel quantitative geographical information system-based procedure to obtain the magnitude (area) and frequency of medium to large first-time shallow slope failures. The procedure has been set up at the Barcedana Valley, in the Tremp Basin (Eastern Pyrenees). First, pixel-based susceptibility classes were defined using a slope stability index obtained with the physically based model SINMAP. The frequency calculated from the number of first-time failures recorded during the last 60 years was then assigned to each susceptibility class. We devised a procedure to estimate the size of potential failures by means of the aggregation of pixels within the boundaries of morphological slope units, optimized for the purpose. Finally, the landslide hazard was prepared using the magnitude-frequency matrix. Results show that a proper pixel clustering has been carried which avoids the generation of small groups of pixels with different susceptibility degrees within the same slope unit. For a given hill slope, the area of the cluster of pixels depends on the size of the slope unit, which is not unique as it depends on the criterion used to delineate them. Therefore, the latter is a key factor in the final results. In this study, we validated our results with the size distribution of the observed landslides. The methodology presented in this work can be applied using any susceptibility assessment method with a pixel-based output.


Introduction
Magnitude and frequency are the two key factors for a quantitative assessment of landslide hazard Fell et al. 2005;Picarelli et al. 2005;Van Den Eeckhaut et al. 2007;Corominas et al. 2014;Lari et al. 2014;Vranken et al. 2014). Thus, knowledge of the probability of occurrence of landslides of a given magnitude is mandatory to obtain landslide hazard maps. In this work, the landslide magnitude is expressed as the area (number of pixels) affected by the slope failure.
The concept of intact slopes is not straightforward as strictly speaking, they do not exist in nature: evidences of slope processes acting in the past are not present after a few years due to the erosive cycle. From a geotechnical point of view, a slope can be considered as intact if it mobilizes peak strength conditions. Geomorphologically speaking, intact slopes display undisturbed strata and show consistent dip angles in contrast with existing landslide bodies. The latter can be characterised by undulated or hummocky ground surface, presence of scarps, open cracks, tilted trees and disturbed soils and rocks. The stability of intact slopes can be assessed by means of physically based, statistical, analytical or heuristic approaches (Aleotti and Chowdhury 1999;Das et al. 2011;Ghosh et al. 2012). Some physically based models calculate the safety factor (SF) that can be integrated in the landslide hazard assessment (Wu and Abdel-Latif 2000;Frattini et al. 2004;Haneberg 2004Baum et al. 2005. Advanced approaches couple the hydrological model and the infinite slope stability model and both can be implemented with various degrees of sophistication, including either steady-state or transient conditions (Montgomery and Dietrich 1994;Savage et al. 2004;Baum et al. 2008;Godt et al. 2008;Salciarini et al. 2008;Rossi et al. 2013;Alvioli et al. 2018). The first physically based models were twodimensional (2D), but nowadays, three-dimensional (3D) slope stability models exist (Marchesini et al. 2009;Jia et al. 2012;Mergili et al. 2014;Raia et al. 2014;Alvioli and Baum 2016). Many physically based models are implemented within a GIS, or are supported by GIS processing, and calculate the SF for single pixels. Meaningful combinations of the spatially distributed SFs allow to define 3D geometries of instable portions of the terrain (e.g. Marchesini et al. 2009) or evaluate the SF for the most critical slip surfaces (e.g. Jia et al. 2012;Mergili et al. 2014). More sophisticated models incorporate a probabilistic package that calculates the probability of failure (probability of SF being smaller than unity) by evaluating the different slip surfaces as well as the uncertainty in the input values (e.g. Pack et al., 1998;Rossi et al., 2013). Another example, for the case of shallow landslides, is the integration of the TRIGRS model Alvioli and Baum, 2016) with SCOOPS3D (Reid et al. 2015), where the latter uses the timedependent infiltration model results as calculated by the former and performs an additional 3D analysis (Tran et al. 2018), providing an overall 3D, time-dependent analysis. Physically based models work at pixel level by design, and the interpretation of the results at slope level needs to be addressed in a meaningful way (e.g. Alvioli et al. 2014;Bellugi et al. 2015). Physically based models also require higher resolution data, with respect to statistical ones, typically geotechnical parameters of the soil. Such data is often difficult to obtain on large areas, and for this reason, probabilistic approaches have been introduced, for example using TRIGRS in Raia et al. (2014) and Salciarini et al. (2017). Another limitation posed by the physically based models accounting for water infiltration is the dependence on the initial water table depth (Baum et al. 2005), which is difficult to estimate and needs to be measured on the field. An evaluation of the different options for data gathering for physically based models, specifically for TRIGRS, was recently described in Yatheendradas et al. (2019).
The SF output of physically based models can be transformed into a stability index (SI), for example by means of the Monte Carlo method, providing multiple SF analysed statistically. The SI, which can be used as a proxy for susceptibility (spatial probability of landslide occurrence) due to its statistic component, can be empirically converted into annual probability by correlating it with the temporal frequency of observed failures: the larger the temporal window of the inventory, the better the conversion will be. However, the SI resulting from 2D physical models is pixelbased, and if one does not resort it to explicit 3D methods, it is difficult to go beyond the pixel-based scale. In other words, a main restriction of the pixel-based analysis is the lack of correspondence between the landslide size and the pixel size. In many landslide susceptibility maps carried out in clayey formations, the size of the landslide often does not match to the size of the pixel (landslides are either bigger or smaller than the pixel size) (e.g. Baum et al. 2005;Salciarini et al. 2008;Pradhan et al. 2018) and this may not reflect the actual occurrence of the slope failures.
Thus, when analysing a pixel-based output from a physically based model, one faces two main challenges: 1. Assessing the susceptibility of intact slopes. 2. Assessing the magnitude of failures bigger than the pixel size.
A procedure to overcome this restriction is by defining the physical boundaries of the hill slope where the potential failure can occur (e.g. Guzzetti et al. 1999Guzzetti et al. , 2006. In this case, any analysis carried out affects the whole slope unit in terms of uniform spatial probability of landslide occurrence within the same slope unit, and therefore, this approach is difficult to carry out in large slopes showing variable steepness as the failure may only affects a small portion. In this work, we have assumed that failures cannot be bigger than the physical boundaries of the hill slope. We have devised a procedure to obtain the area of failures bigger than the pixel size and constrained by physiographic boundaries. The latter are commonly referred to as terrain units (TUs).

Terrain units
TUs are portions of terrain with similar geological and geomorphological features which divide a region in portions that have a set of common properties, different from the adjacent ones, across definable boundaries (e.g. Hansen 1984;Lee and Evangelista 2006;Pavel et al. 2008). In other words, they maximise the internal homogeneity and the between-unit heterogeneity (Jia et al. 2012;. TUs embody the geomorphologist assumption that both geological and geomorphological features properly describe the boundaries resulting from the relationship between materials, landforms and processes. Since landslides occur in sites with a certain slope gradient, the failure is constrained between the highest and the lowest part, i.e. between the divide and the drainage line of a given slope, respectively. Slope units (SUs) are a particular type of TUs causally related to the hydro-and geomorphological conditions and processes that shape natural landscapes, including landslides. Each SU is encompassed by a drainage (or valley) and a divide line (Carrara 1988;Carrara et al. 1991Carrara et al. , 1995Carrara et al. , 2003Giles 1998b;Guzzetti et al. 1999;Xie et al. 2004;Komac 2006;Guzzetti et al. 2006;Copons and Vilaplana 2008;Reichenbach et al. 2014). SUs are a formalization of what geomorphologists describe as slopes, and failures take place within their boundaries (Guzzetti et al. 2000).
SUs are known to be the most appropriate terrain partitioning for hydrological and geomorphological studies, including landslide susceptibility modelling and zonation (Carrara et al. 1991(Carrara et al. , 1995Guzzetti et al. 1999Guzzetti et al. , 2006 and landslide hazard assessment (Xie et al. 2003;Giles 1998a, b;Van Den Eeckhaut et al. 2009;Jia et al. 2012). In the literature, the use of grid cells is often preferred to other types of mapping units, including slope units. This is largely due to the simplicity of using regular grid cells, in conjunction with a gridded digital elevation model, with respect to irregular polygons. Most of the existing GIS software can perform complex operations on a pixel basis, while calculating quantities within slope units requires some additional effort. Moreover, the very delineation of slope units is not straightforward, especially on large areas. Nevertheless, using a meaningful slope unit delineation has several advantages for a number of applications. Signatures of geomorphological processes are uniform within slope units, and maps produced on the basis of slope units are inherently more suitable for land use planning and hazard management, since their boundaries are recognizable in the field. Moreover, for the specific purpose of this study, pixels are not suitable to define the magnitude of potential landslides, simply because the stability model does not provide a means to go beyond the pixel-based description. Hence, SUs are a valuable tool to assess the magnitude of failures larger than the pixel size.
Beyond pixel-based susceptibility using slope units Figure 1a shows the results of a pixel-based landslide susceptibility analysis. Although one is not entitled to extrapolate the result outside of each pixel's spatial domain, the presence of a cluster of high-susceptibility pixels (orange colour) suggests that, in case of failure, it will involve several pixels at a time. As the analysis is performed at pixel scale, there is no connection with the other adjacent pixels as it happens in nature, the stress state of a portion of terrain actually being modified by changes in the stress state of an adjacent one, which can lead to instability.
A rough way to account for failures larger than a pixel is by assigning a unique susceptibility value to the whole SU. This can be done, for instance, by averaging the susceptibility values within the SU polygons ( Fig. 1b) and assuming that the size of the SU defines the size of the potential failure. In the example of Fig. 1b, however, two main inconsistencies are observed: the average susceptibility obtained is low (yellow) and hides the fact that part of the slope is susceptible to fail. Moreover, if the spatial extent of the potential failure is smaller than the SU (orange in Fig. 1a), the area of the latter is not a proper proxy of the possible slope failure magnitude.
In order to overcome the disadvantages of averaging, we intend to work directly with the area of the pixel clusters defined within the same SU, which will be used as a proxy for the magnitude of the potential failure. The size of the clusters of pixels with the highest-susceptibility values are expected to collectively furnish a proxy of the estimation of the magnitude of a potential landslide in the SU: we expect that failures of different magnitude will take place within the clusters according to their probability of occurrence. To this end, a process of pixel clustering has to be developed.

Landslide hazard mapping
Landslide hazard can be expressed as the frequency of landslides of given magnitude within a certain area and a given time period; it is evaluated through different methods depending on the scale of analysis (Fell et al. 2008). According to the suggestions given by Fell et al. (2008), landslide hazard zoning assigns an estimated frequency to the potential landslides resulting from susceptibility mapping. Frequency is combined with the magnitude of potential instabilities to map landslide hazard by means of magnitude-frequency (MF) relations expressed as an MF matrix. Usually, these matrices are prepared for each landslide mechanism (Strozzi et al. 2013) and the hazard degree results from the combination of both input parameters. For instance, frequent smallsize landslides are not ranked as highly hazardous because their intensity and the subsequent damage are low and vice-versa. This topic has been widely discussed in countries such as Switzerland, Italy as well as Australia, among others, where custom methodologies have been developed (AGS 2007;Kranitz and Bensi 2009;Raetzo and Loup 2009) and implemented elsewhere (e.g. Cardinali et al. 2002Cardinali et al. , 2006Raetzo et al. 2002;Sterlacchini et al. 2007;Posch-Trözmüller 2010;Strozzi et al. 2013).
Landslide hazard can be expressed in qualitative or quantitative terms. Usually, qualitative assessment includes subjective criteria of experts and, therefore, a certain subjectivity in the interpretation of the threat posed by different landslides. MF matrices are usually presented in qualitative terms as low, medium, high and very high. However, if frequency and magnitude can be quantified, the different degrees of hazard can be quantified as well.
The objective of this work is to evaluate the potential for large slope failures (translational slides) of the intact slopes by applying a deterministic model in the Barcedana Valley (Eastern Pyrenees, Spain) and using the area of the most susceptible pixel clusters as a proxy of the magnitude. The details of the methodology and the study area are given in the following sections.

General setting
The study area is located in the Barcedana Valley (or Basin), in the southern part of the Tremp-Graus Basin, which is an erosive depression within the South Central Structural Unit (SCU) (Seguret 1972). The SCU includes, from north to south the Boixols, Montsec, and Sierras Marginales thrusts. The Tremp-Graus basin is located north from the Montsec thrust sheet and forms a wide syncline. Materials filling the Tremp-Graus basin are mostly claystones and sandstones from the Upper Cretaceous/ Palaeocene (Garumnian facies) and limestones and conglomerates from Eocene-Oligocene ages. Typical Mediterranean climatic conditions prevail in the area with intense storms at the end of spring and in the beginning of autumn and dry periods in January-February and early summer. The mean annual precipitation is about 600-700 mm mostly in the form of high-intensity storms (Novoa 1984), some of them reaching up to 100 mm in 10 h (approx.), and the mean annual temperature is 12.5°C with a mean maximum temperature of 27.6°C and a mean minimum temperature of 0.4°C.

Geological description
The Llimiana Range is formed by well-cemented materials from Oligocene: sandstones, conglomerates, reddish and grey lutites, limestones with alveolines and marls and lutites. To the south, the Montsec Range exhibits also well-cemented materials from Garumnian Facies: Micritic limestones, calcarenites and lutites of Fm La Posa (Maastrichtian); Sandstones and calcarenites of Fm Gresos d'Areny (Maastrichtian) and limestones, calcarenites and sandstones of Campanian-Maastrichtian age (Fig. 2b). Between both ranges, within the valley, a big extension of colluvium can be found. It is mainly composed of clays with angular sandy and silty clasts of Holocene age. Garumnian deposits are also present: red lutites, paleosoils, sandstones and gypsum of Selandian-Thanetian age, as well as micritic limestones, calcarenites and lutites of Fm La Posa (Maastrichtian).
Commonly, Garumnian deposits of the central Pre-Pyrenees are divided in three main parts from bottom to top (Rosell et al. 2001 Fig. 1 a Pixel-based susceptibility in two sample SUs (SU1 and SU2). b SU-based susceptibility, after averaging, within the same SUs 2. Lower Red Garumnian (red claystones): it is made of red silty materials with interbedded levels of lenticular fining upward sandstones. Between silty deposits, there exist thin layers of fine-grain sandstone with cross lamination. These silty deposits are often affected by joints and present calcareous nodules. The roof of the Lower Red Garumnian level has been defined as the limit between Mesozoic and Cenozoic. According to the geological map, Rosell et al. (2001) and field observations, the cartographic units from Garumnian facies, outcropping in the Barcedana Valley, have been classified as follows: 1-PPlg, which in some parts is covered by Qco, corresponds to the Upper Red Garumnian 2-KMlg corresponds to the Lower Red Garumnian 3-KMcp corresponds to the Gray Garumnian.
Vallcebre limestones do not outcrop in Barcedana Valley as they are either covered by quaternary colluvium or they are not present in this part of the basin. Materials from the Red Garumnian Facies (PPlg and KMlg) are the main responsible for the several numbers of landslides occurred, especially, during rainy events.
The most common type of landslides in the Tremp Basin are translational slides developed in the wide slopes of the region on clayey formations (Figs. 2b, c and 3a, b). These are low-plasticity claystones which include gypsum veins. Despite affecting large surface areas (< 5000 m 2 but up to more than 20,000 m 2 ), they are relatively shallow because the sliding surface develops at the contact between the disturbed (weathered) and the intact claystone. Most of the failures show depths ranging between 1 and 5 m. Locally, in case of slopes next to entrenched streams, the steep slopes may also generate rotational failures (Fig. 3c). Therefore, in terms of area, they can be considered medium to large landslides but not in terms of thickness. These landslides are mostly selfcontained and short runout (e.g. Carrara et al. 2003;Guzzetti et al. 2005). The debris sometimes accumulate in hollows and drainage ways and may eventually turn into slow-moving earthflow. Contrary to the typically clay formations, slope failures in the Tremp basin may be triggered due to shortlasting very intense rainfall events. The reason is that the most superficial part of the claystones has been unloaded by the erosion and presents cracks and macropores ( Fig. 3b) (Corominas 2001(Corominas , 2006. Furthermore, volumetric strain and cracks usually appear within the first meters below the ground surface due to moisture changes, thus favouring rainfall infiltration.

Data
Multiple remote-sensing images and maps acquired from various sources (Table 1) have been used to create a map of geomechanical units and landslides (Fig. 2c). A high-resolution DEM with a grid cell size of 5 m, a geological map and 11 orthophotos have been freely downloaded from the Catalonian Geological survey (ICGC). The aerial images are true-colour georeferenced orthophotos with a pixel resolution of 50 cm covering the period from 1956 to 2013.
The landslide map has been carried out through field surveys and stereoscopic analysis of orthophotos. Regarding the geological map, firstly, we used a 1:25,000 scale geological map prepared by the Catalonian Geological Survey (ICGC) and we checked it in the field at 1:10,000 scale. The geomechanical units were defined by the ICGC. The reclassification consisted in merging formations with similar lithology. As a result, six geomechanical units have been defined (Table 2) to carry out the physically based slope stability analysis. The hard rock unit (unit 1) outcrops at the southern part of the Barcedana Valley (the Montsec north face). It is mainly composed of well-cemented conglomerates and competent sandstones and limestones. During the field campaigns, we observed some large rockfalls that were generated in this unit. Interbedded weak and hard rocks (unit 2) outcrop at the northern of Barcedana Valley. They consist of competent limestones with alveolines and well-cemented conglomerates with interbedded sandstones and lutites. In some parts of the valley, they are quite fractured and frequently affected by rockfalls as well. The weak rock unit (unit 3) outcrops in the central part of Barcedana Valley, in the quasivertical gully walls and composed of claystones and siltstones. Locally, sandstones can be predominant. In other parts of the valley, it appears as a less competent material which, in some cases, can be disaggregated by hand. The coarse and fine colluvium units (units 4 and 5, respectively) are present in the central part of the Barcedana Valley. Coarse colluvium is mostly composed of angular clasts (centimetre and decimetre in size) embedded in a silty matrix, it is well graded and shows a chaotic internal structure. Fine colluvium is the most extended and landslideprone material in the Barcedana Valley. It is constituted by silty clay with low plasticity and is responsible for a high number of slope instabilities. The last unit, defined as slide material (unit 6), is found in the central part of the Barcedana Valley. It consists of soil deposits showing well-graded chaotic texture, as well as other landslide features such as presence of scarps, open cracks, hummocky ground and tilted trees. It is mainly composed of clayey and silty materials as well as clasts of several sizes (from centimetre to decimetre size). These materials present residual strength conditions (Canales 2011;Montero 2011;Oliveras 2011). Colluvium and old landslide deposits often overlay the claystones of the Garumnian facies (unit 3).
The inventory of first-time failures has been prepared by stereoscopic analysis of the orthophotos and direct field mapping (Fig. 2c). During the stereoscopic analysis, we first selected the orthophotos from the ICGC with sufficient resolution (Table 1) to identify the landslide bodies and indicators of activity, such as scarps and cracks. These observations were confirmed during the field trips where we could also identify other indicators of activity like fallen and disturbed trees, presence of water, intact and disturbed bedding, translational and rotational platforms and structures with slightly or severe pathologies. A total of 17 first-time large slope failures have been identified for the period 1956-2013 (Table 3). The thickness of the landslides has been estimated in less than 5 m. They are mostly translational slides with one case out of 17 showing a rotational component (Table 3). These types of landslides in the study area are typically triggered by short-lasting intense rainfalls due to the presence of cracks. To be certain of the firstfailure character, we checked that none of the mapped landslides were present in the older set of orthophotos (1956). It is worth noting that the lack of images with sufficient quality from 1956 to 1990 may lead to the omission of events occurred during this period due to the erosive cycle. The area of the failures range from 816 to 27,700 m 2 , and they mainly occurred on fine colluvium (53%), followed by coarse colluvium (29%) and weak rock unit (18%). Such landslides have occurred in quite gentle slopes, ranging from 8°to 37°with a mean slope angle of 17°( Canales 2011).

Methodology
Establishing a relation between the outcomes of physically based models, as the one adopted in this work, and our ultimate aim of preparing hazard maps for large first-time failures that exceed the pixel size of 5 m (DEM resolution) consists of several steps. The first step of our method is the susceptibility assessment through the calculation of the SI for each pixel using a slope stability model coupled with a hydrological model (SINMAP) (Pack et al. 1998). Then, to constrain and to calculate the size of the potential failures, larger than the pixel size, we delineated SUs using an existing software and re-classifying the SI map within each SU polygon, using a novel pixel clustering procedure. The ultimate goal of our approach is to consider the sizes of the high-susceptibility pixel clusters as a measure, or a proxy, of the magnitude of potential landslides in intact slopes. The temporal component of landslide hazard (i.e. frequency) was estimated using the probability of the first-time failures from the inventoried instabilities. Eventually, hazard was calculated using a MF matrix considering the mentioned magnitude and frequency.
Susceptibility assessment using a physically based model We used the physically based model SINMAP (Pack et al. 1998), a software developed for the analysis of slope stability at regional scale in a GIS environment. The software is based upon the infinite slope stability model coupled with a simple hydrological model, TOPMODEL (Beven and Kirkby 1979), which calculates the topographic wetness index w for each pixel of a digital elevation model (DEM). It considers the Mohr-Coulomb criterion to calculate the SF as follows: in which C dl is the adimensional cohesion, ρ w [kg/m 3 ] is the water density, ρ s [kg/m 3 ] is the soil density, φ [°] is the friction angle, and θ [°] is the slope angle. The adimensional cohesion is given by where C [N/m 2 ] is the soil cohesion, g = 9.81 [m/s 2 ] is the acceleration of gravity, and h is the height [m] of the soil column in the given pixel. The hydrological model included in SINMAP calculates the wetness index w starting from a few assumptions: 1. Surface runoff follows topographic gradient. This assumption allows calculating the specific catchment area at each pixel as: where a is the specific catchment area [m], A is the contributing area [m 2 ], and b is the unit contour length (pixel size) [m].
Considering assumption 3, the wetness index w can be calculated as follows: SINMAP does not require a single value for the input parameters, and it also accepts a range of values, in case the parameter values are not uniform or the exact value is not well constrained. This uncertainty about the input parameters permits to use the binary deterministic approach (FS = < 1, failure; FS > 1, non-failure) as a proxy for susceptibility (probability of having a landslide in a given area) through the stability index (SI). For FS min > 1, the SI is the FS min (SI = FS min ). For areas where FS min < 1, the SI = Prob (FS > 1). In case FS max < 1, then SI = Prob (FS > 1) = 0. The numerical values of input parameters used for applying SINMAP in this research (C dl , φ and T/R) are listed in Table 4 and discussed in more detail in Domènech (2015). They have been obtained from different field campaigns carried out in the study area (Canales 2011;Montero 2011;Oliveras 2011;Domènech 2015) and from the literature (Coduto 1999;González de Vallejo et al. 2002). For each unit, the same h soil value has been assigned to all the pixels where it outcrops. Pixels where units 1 and 2 outcrop correspond to hard rock and interbedded weak and hard rock (Table 2), and they require another specific analysis for rockfalls.
Finally, the SI is reclassified into four susceptibility classes of increasing probability of failure according to the observations made in the field. To this aim, the classification of the SI values has been modified from that proposed by Pack et al. (2005), using the following classes: 0 < SI < 0.25 (high susceptibility); 0.25 < SI < 1.0 (medium susceptibility); 1.0 < SI < 1.5 (low susceptibility); 1.5 < SI < 10 (very low susceptibility). Results of SINMAP have been validated by a receiving operating characteristics (ROC) analysis (Fawcett 2006).

Parametric delineation of slope units
Although a unique SU size for a given hill slope does not exist, in mountainous areas, SUs can be easily identified, as compared to more gentle slopes areas. In the latter, the topography has not been sufficiently shaped by the erosive processes and, therefore, the drainage lines and the subsequent catchments to be later partitioned into SUs are more difficult to delineate and several possibilities exist. In the latter, it might happen that the morphodynamic processes trespass the SU limits. In such cases, it has to be decided whether the landslide extension prevails over the SU or vice versa.
Our starting point is that the observed first-time slope failures in the field should not exceed the size of the SU and conversely, they should not be too small in relation to the size of the SU. To this end, we have first carried out a preliminary delineation of SU to achieve a correspondence between the size of the SU generated and the size of the observed slope failures.
SUs can be drawn manually, which is an intrinsically error-prone and subjective procedure, or using specialized software (Xie et al. 2004). The basic ingredient to obtain a meaningful SU partition is a hydrological tool to delineate both drainage and divide lines (Fig. 4): an SU is defined from the catchment, which is generated using a drainage network characterized by a given value of flow accumulation threshold. Such flow accumulation threshold is also known as contributing area threshold (CAT). Conversely, pixels with flow accumulation value smaller than the CAT will not be considered as part of the drainage line. One of the existing approaches to draw SUs is the one described by Xie et al. (2003), who used the Arc Hydro tool (David 2002) to reverse the DEM, with the effect of exchanging the original drainage lines into divides and vice-versa. On the other hand,  used the algorithm of Metz et al. (2011) to define simultaneously drainages and divides and considered the average aspect direction of each portion of terrain encompassed by such landscape boundaries as a measure of homogeneity, as required by the very definition of SUs.
In this work, we used the software r.slopeunits of  for SU delineation. They assumed that no unique SU delineation exists, since there is an inherent dependence on the scale and nature of the process under investigation. Non-univocity implies no definite SU size and different degrees of homogeneity of aspect, and other morphological properties, within each SU. The r.slopeunits software accepts a number of input parameters, which control the size and shape of the output SU map. The numerical values of the input parameters must be provided by the user based on his/her expertise or on some objective parameter selection method. The SUs boundaries delineation also depends on the DEM resolution (Mashimbye et al. 2014;Schlögel et al. 2018). The algorithm of  is implemented in the GRASS GIS (Neteler and Mitasova 2008) module r.slopeunits freely available from http://geomorphology.irpi.cnr.it/tools/slope-units. In addition to the DEM, relevant inputs of the method implemented by the software are two parameters that drive the SU delineation process: (i) the circular variance, c, representing the degree of slope aspect homogeneity required within each SU; (ii) the minimum area, a [m 2 ], i.e. the threshold size under which a polygon is considered as an SU. Other input parameters are required for using the software; their role in the SU delineation is more technical and it is described in detail in ; thus, we focus here only on the determination of suitable c and a parameter values, which are directly related to the local terrain morphology.
The r.slopeunits software has been run for many combinations of the values of the input parameters a and c, namely c = (0.01, 0.05, 0.1, 0.15, 0.2, 0.4) for the circular variance threshold and a = (5000, 10,000, 25,000, 50,000, 100,000, 200,000, 500,000) m 2 , which determined 42 sets of SU, obtained from all the possible (a, c) combinations. The choice of the ranges of variation of input parameters was not dictated by specific criteria, but only on heuristic judgement based on a brief preliminary analysis, and they are similar to the ones used in previous work Bornaetxea et al. 2018;Schlögel et al. 2018). Each SU delineation is a cover of the study area, with polygons of different shapes and sizes, but all of them represent valid hydrological subdivisions of the terrain. For example, it is intuitive that the size of SU is larger for sets produced using larger values of the parameter a. Similarly, larger values of the parameter c require SU with large terrain variability that is likely found within larger polygons. A finer control of the output of the r.slopeunits software as a function of the whole set of input parameters is virtually impossible, since it depends from the complex interplay of individual parameter values and local terrain characteristics.
Re-assessment of susceptibility values within the SU: magnitude of the potential failure Results of the SINMAP are in terms of a pixel-by-pixel mosaic of different SI values (Fig. 5a), which are not suitable to our goal which is the generation of larger clusters of pixels that may explain the occurrence of big failures. (Fig. 5b).
It is important to stress that the outcome of the clustering depends on the SU delineation: the SU number, size and boundaries are different for each of the considered (a, c) combination. For the purpose of this work, the cumulated frequency of the mapped first-time slope failures and the predicted ones (highsusceptibility cluster of pixels) are used to optimize SU parameters for better agreement with the distribution of the inventoried landslides. In such case, we are assuming that the landslide frequency size distribution will not change over time, which is not completely true, especially in areas affected by a strong disturbance such as large earthquakes (Fan et al. 2018). Therefore, it is advisable to update the model according to the local conditions. In this study, a custom metric has been defined to establish which SU delineation produces the best result. It is a positive calibration that calculates the difference between measured and modelled values and is defined as follows: where 17 is the number of observed first-time slope failures in the study area within a 57-year period, D i is the ith point of the observed distribution and P i (a, c) is the corresponding ith point of the predicted distribution obtained with the specified values of (a, c), among the 42 different SU delineations. It is worth noting that an optimal SU map with respect to aspect segmentation can also be selected by finding the maximum of the general-purpose objective function proposed by Espindola et al. (2006) and adapted for dealing with aspect segmentation in .
Eventually, the magnitude of the potential failures was evaluated considering the size of the pixel clusters belonging to the highest susceptibility class and located within the same SU, which are the most likely to fail.

Assessing the probability of the first-time failures
The frequency of first-time failures for a cluster of pixels of a given susceptibility class was calculated using the inventory of failures identified both in the field and by a stereoscopic analysis of orthophotos ( Fig. 2c; Table 3). From the landslide inventory, we calculated the number of first-time slope failures N i in each susceptibility class over the total number of first-time slope failures N T . Then, the ratio has been divided by the observed period of time P and normalised by the area occupied by each susceptibility class A i . The frequency is expressed in terms of the number of first-time slope failures per year and per square kilometre as follows: And it has been divided into four classes defined in qualitative terms (e.g. very low, low, medium and high).

Fig. 5
Re-assessment of susceptibility values. a Pixel clusters of the same susceptibility, whose area is smaller than a given threshold previously defined. b Reclassified clusters, merged within the bigger adjacent susceptibility values located within the same SU

Original Paper
The hazard has been evaluated by combining the frequency and magnitude, obtained in the previous sections, for each cluster of pixels located within a given SU. For each combination, the degree of hazard has been established and expressed in qualitative terms, too. The number of classes and the definition of each one have been established according to the Swiss Federal Guidelines (Raetzo et al. 2002): very low, low, medium and high hazard. We assume that the failure affects only the cluster of pixels.

Results and discussion
The output of SINMAP is presented in Fig. 6 together with the mapped first-time slope failures. Visually, most of the failures are found in the high-susceptibility classes, although some sparse pixels of lower susceptibility are also present. Results performed by the ROC analysis (Tables 5 and 6) indicate a good performance of the model with an area under the curve (AUC) of 0.87.
In our study area, large landslides are located in gentle slopes where the delineation of the SUs is not trivial; therefore, to analyse the effect of the SUs and their size, a few different cumulative frequency distributions have been built for (i) the observed (mapped) first-time failures, (ii) the raw high-susceptibility adjacent pixels predicted by SINMAP before pixel clustering, (iii) the same high-susceptibility adjacent pixels after pixel clustering without using SUs and (iv) using SUs (the latter resulting in 42 different distributions). The corresponding results are presented in Figs. 7 and 9. Figure 7 shows the cumulated frequency distributions of the pixels and areas in the most susceptible class, whose leftmost point is normalized to 100% by definition. In general, the frequency size distribution of the susceptibility map obtained directly from SINMAP includes a higher number of small clusters, as expected (Fig. 7a). Concerning the pure clustering process (without SUs), e.g. in the case of a flat area, where the topography of the terrain does not constrain the landslide area, the trend is the same as the results obtained directly from SINMAP (Fig. 7a). However, by  Table 5 Validation of SINMAP by a receiving operating characteristics (ROC) analysis comparing the model results versus observed landslide presence. We converted the probabilistic result to 0/1 using a threshold (column Th.). We show true positives, false negatives, true negatives and false positives, along with hit rate and false alarm rate, as a function of the threshold. Plotting HR versus FAR one gets the ROC curve. A value of area under the curve (AUC) of 0.87 has been obtained by means of the R function ( removing the clusters smaller than the given threshold (800 m 2 ), the frequency of the remaining ones increases considerably. Another important factor is the size of the SUs. For clarity, a few SU sets covering the whole range of frequency size distributions have been selected (Fig. 7b). The SU sets denoted by "set 2", "set 5", "set 10", "set 20", "set 35" and "set 42" have been obtained respectively with the (a, c) combinations: (10,000 m 2 , 0.01), (100,000 m 2 , 0.01), (25,000 m 2 , 0.05), (200,000 m 2 , 0.1), (500,000 m 2 , 0.2) and (500,000 m 2 , 0.4). The slopes of the six distributions clearly indicate that the smaller the SU polygons (roughly proportional to the values of both the a and c parameters), the smaller the sizes of the high-susceptibility pixel clusters and vice versa. It is worth nothing that most of the curves nearly fit with the pure clustering distribution (grey line) for areas between 1000 and 10,000 m 2 , i.e. the smaller clusters of pixels given in the pure clustering process (grey line) are not affected by the SUs. From these results, we can already state that the SUs and their size (i.e. the type of relief of a given area) are a key factor that controls landslide size.
The results of the custom metric defined in Eq. 7 (the cumulated frequency of the mapped landslides and the predicted ones are used to optimize SU parameters for better agreement with the distribution of the inventoried landslides) and the objective function proposed by Espindola et al. (2006) and adapted in  are shown in Fig. 8. Such objective function is shown in Fig. 8b along with the F(a, c) metric of Eq. 7 (Fig. 8a). We note that the present optimization procedure has selected an SU delineation which does not provide optimal aspect segmentation, as a result of the peculiar metric used in this work, but it is still in the region of acceptable values of the SU to be a meaningful partition of the terrain as far as the aspect homogeneity is concerned.
The distribution obtained by the set of SUs corresponding to the minimum of F(a, c) is shown in Fig. 9 with a yellow solid curve, denoted by "set 12", and it has been obtained by (a, c) = (100,000 m 2 , 0.05).
The temporal probability of the first-time slope failure for each susceptibility class has been calculated and normalised by the total area occupied for each susceptibility class using Eq. 8. For each class, the area of intact slopes in 1956 has been calculated and used to normalize the frequency of each susceptibility class. Since the number of pixels falling in a given susceptibility class depend on the aggregation process and therefore on the SU size, the frequency for a given susceptibility will be dependent on the SUs size as well. As afore-mentioned, a unique SU size does not exist, particularly in gently sloping areas like our study site. Hence, a previous analysis to evaluate the influence of the SUs size on the frequency of the first-time failures for a given susceptibility has been performed. A group of four types of SUs afore-used to analyse the influence of the SUs in the frequency size distribution has been taken ( Fig. 7b: SU2, SU20, SU35 and SU42). As shown in Table 7, for a given susceptibility class, the area of intact slope does not change so much (maximum variation of 0.14 km 2 over a maximum of 8.59) and the final frequency fluctuation is about 4%. Hence, the influence of different SUs on the time dependence can be neglected. The results of Table 7 are consistent with the susceptibility class of the clusters of pixels. The frequency of the first-time slope failures in the highest susceptible class (SC4) is one order of magnitude bigger than the frequency in the mid susceptibility class (SC3). Additionally, the null values of frequency obtained for the two lowest susceptibility classes (1 and 2) and the increasing for the susceptibility class 3 and 4 have confirmed the reliability of SINMAP and the reclassified SI values. The fact that the susceptibility classes 1 and 2 show the same frequency (0) is a matter of the small number of first-time slope failures present in our study area. Table 6 Validation of SINMAP by a receiving operating characteristics (ROC) analysis for high and medium susceptibility classes. Number of pixels classified as true positive (TP), false positive (FP), false negative (FN), true negative (TN) and the resulting true positive rate (TPr) and false positive rate (FPr)  However, the distinction between both classes is important as in the stability analysis they showed a different susceptibility. Finally, the MF matrix has been prepared considering the degrees of hazard of the Swiss Federal Guidelines (Raetzo et al. 2002) and it is shown in Fig. 10. Large translational landslides are usually very slow movements in which large failures may imply effects onto larger area of the territory rather than an event with higher velocity. Therefore, frequency was considered to prevail over magnitude when evaluating the degree of hazard for events with low and very low frequency. For medium and high frequencies combined with low magnitude (index 13 and 14 in Fig. 10), the degree of hazard is considered to be low and medium, respectively, as the size of the failures will be considerably small (< 5000 m 2 ). However, for events characterized by a medium frequency and high magnitude (index 33, Fig. 10), the large size and thickness of the movements, and its high recurrence, suggests that the damage produced to the buildings may be more severe.
The hazard map of the whole area after the automatic pixel aggregations, using the optimal SU (set 12 in Fig. 9), and the application of the MF matrix is presented in Fig. 11a. The comparison between the reclassified susceptibility map obtained from SINMAP, the susceptibility map after the automatic pixel aggregation and the hazard map are shown in Fig. 11b-d, respectively. The figure shows a south and south-east-facing slope, prevalently with susceptibility in the highest class (Fig. 11b). However, some pixels with medium, low and very low susceptibility (indicated with black arrows) have been calculated by SINMAP in the middle of such slopes. Obviously, if a landslide hazard mapping is performed, these small clusters (area < 800 m 2 ) showing susceptibility values smaller than the surrounding pixels will not be considered as less susceptible, but as an artefact due to the pixel-based nature of the SF. Therefore, the susceptibility has been homogenised along the slope by means of the SU-dependent clustering procedure devised in this work, and the area of the adjacent clusters, located within the same SU, with the same SI has been calculated (Fig. 11c). Thereafter, the MF matrix has been applied ( Fig. 11d) obtaining different degrees of hazard according to the size of the pixel clusters. Fig. 8 Slope units optimization metrics, calculated as a function of the SU delineation software input parameters, circular variance c and minimum area a, for 42 different combinations. a The values of F(a, c), defined in Eq. 7, quantifying the deviation between the observed first-time failure distribution and the predictions obtained as a function of the SU delineation parameters: the optimal values are found in correspondence of the minimum value of F(a, c), denoted by the black dot. b The aspect segmentation objective function proposed by Espindola et al. (2006), adapted for aspect segmentation as in . A green dot is shown in correspondence to the optimal values by the procedure developed in this work and in Domènech (2015), and whose objective function is shown a Fig. 9 Cumulative frequency size distribution belonging to the highest-susceptibility class (orange line) and obtained after the pixel clustering using the SU that shows the best agreement, according to the metric defined in Eq. 7, and shown in Fig. 8, with the observed landslide cumulative frequency obtained (first-time failures) (full circles) It is important to notice that within the same SU, pixel clusters of different susceptibility are possible. In terms of probability of failure, we expect a frequency of failures of each susceptibility class according to the results of Table 7. The hazard classes of the clusters of pixels (Fig. 11d) are defined based on the pairs of frequency of failures-size of the pixel cluster. Thus, despite some pixel clusters being classified as high susceptibility in Fig. 11c, they may become medium hazard if their size do not exceed the established threshold (Fig. 11d).

Conclusions
A supervised procedure has been developed and applied in the Barcedana Valley (Eastern Pyrenees, Spain) to evaluate the frequency and magnitude of first-time slope failures. A deterministic model is used to obtain susceptibility classes, which is applied to the SUs to generate the magnitude-frequency matrix. In particular, the aim was to map predicted potential failures of intact slopes with size considerably larger than the pixel size, but smaller or equal than the slope unit in which they fall in. Compared to other hazard analysis approaches using the entire SU (e.g. Guzzetti et al. 2006), the procedure developed here is able to identify and define the area mostly likely to fail within the SU, i.e. most susceptible part of the slope only.
The susceptibility assessment has been performed using the physically based SINMAP model (Pack et al. 2005). The probability of failure has been reclassified into a classified pixel-based SI map.
The magnitude of the potential failures has been obtained performing an automatic aggregation of groups of adjacent pixels with the same susceptibility class, located within the same SU and having an area smaller than a given threshold. In that case, SUs were used as aggregation domains. To this end, we have developed a procedure to split the clusters of pixels of the same SI among different adjacent SU. Since a predefined SU delineation for a given study area does not exist, several predictions using different SU delineations, including SUs of different sizes and shape, were carried out using the parametric r.slopeunits software developed by . We have checked whether the number of pixels of a given susceptibility class depends on the SU size. The magnitude of the potential failure is calculated as the size of the pixel cluster of the highest susceptibility class within the SU. The latter has been chosen by maximizing the agreement with the observed landslide data. The landslide size distribution modelled in this way is considered as our best estimate of the magnitude component of landslide hazard.
The first-time slope failures, mapped by analysing stereoscopically a set of orthophotos covering a 57-year period, have been used to calculate the frequency of each susceptibility class. The frequency was calculated by dividing the percentage of observed first-time failures, falling in each susceptibility class, by the time span covered by the orthophotos, and normalised by the area occupied by each susceptibility class. In this way, we have obtained an estimate of the temporal frequency component of landslide hazard. The procedure of landslide frequency obtained matches consistently with the landslide susceptibility classes generated (Table 7). We have checked that the influence of SU size on the determination of the frequency of the potential failures is negligible.

Original Paper
A magnitude-frequency matrix has been prepared considering that the frequency prevails over the magnitude for low and very low frequency classes. The landslide hazard maps have been obtained considering the optimal SU delineation and applying the MF matrix. Our procedure shows that not all the most susceptible pixel clusters are the most hazardous as the latter also depends on the size of the potential failure (Fig. 11): for small SUs, the final hazard degree is typically smaller, and vice versa. Hence, the SU size and the mapping units in general play an important role to obtain the final results Pavel et al. 2008;Catani et al. 2013;Erener and Düzgün 2013) and the best SU delineation should be chosen after checking their consistency with the observed slope failures.
The general framework can be applied using any physically based model with a pixel-based output, which must be properly classified. The presented methodology allows assessing the hazard in a semiautomatic and objective way, and considering the landslide size instead of individual pixel values or the whole slope unit, thus using both pixel and mapping unit information to the maximum extent. Fig. 11 a Hazard map of the whole area. b Reclassified susceptibility map obtained from SINMAP. c Susceptibility map after the pixel clustering within each SU. d Hazard map obtained after the application of the MF matrix for each cluster of pixels. Note that a high-susceptibility class in b may yield either be a high or a medium hazard class depending on the magnitude of the expected slope failure