Landslide susceptibility mapping of Gdynia using geographic information system-based statistical models

This work aims to prepare a reliable landslide susceptibility model and to analyse the factors contributing to landslides in a dynamic environment by considering the city of Gdynia, Poland as a case study. Geological, geomorphological, hydrological, hydrogeological, and anthropogenic predisposing factors are considered using geographic information systems. Ground types at different depths (1 m and 4 m b.g.l.) are used in the statistical susceptibility assessment for the first time. Landslide susceptibility maps are developed using two techniques in presenting landslides, 13 conditioning factors, and three statistical methods: landslide index, weight of evidence, and logistic regression. The considered factors have an influence on mass movement formation, but their roles are different. Many of these passive factors are interrelated and some of them are also related to active factors, i.e. triggers. Consideration of many thematic layers in the statistical approach allows for the selection of the most appropriate geo-environmental variables. The most significant conditioning factors that affect the likelihood of landsliding include land use and land cover as well as topography. The susceptibility maps generated by the index model and many interrelated passive factors appear to be over-predicted. The logistic regression model and only independent controlling factors (slope angle, slope aspect, and lithology) are sufficient to compile a reliable susceptibility map of Gdynia. Prediction rate curve plots show that the susceptibility map produced using logistic regression exhibits the highest prediction accuracy. The results emphasize the need to check independence in the selection of instability factors and the use of an independent subset of landslides for validation.


Introduction
Landslides are natural morphodynamic processes, which can trigger large-scale disasters if they occur in residential areas or industrial estates. As a result of demographic trends of the last century, 74% of the European population now reside in cities (United Nations 2019). In Poland, city residents constitute 60% of the population (Demographic Yearbook of Poland 2019). In the twenty-first century, urban population growth is predicted to continue (United Nations 2019), and the resulting increase in urbanization will affect the environment, especially geomorphological processes and their dynamics. The combination of geo-environmental conditions likely to cause landslides and the substantial build-up of risk factors make the scale and impact of mass movements on urbanized areas quite significant.
Landslide susceptibility refers to the probability of occurrence of mass movement in a given area due to the existing environmental conditions (Varnes 1984). In Poland, urban area susceptibility assessments have only been conducted for the city of Gdańsk (Małka 2015). Internationally, urban area landslide susceptibility assessment has been performed in a few urban areas such as Sasebo, Japan (Xie et al. 2003), Kuala Lumpur, Malaysia (Althuwaynee et al. 2012), and Seoul, South Korea (Lee et al. 2017). However, this aforementioned issue is very significant, because in areas managed by man even small landslides disrupt the flow of goods and are expensive to clear (Clague and Stead 2012).
The approaches and methods for assessing landslide susceptibility can be grouped into five broad categories: geomorphological mapping, analysis of landslide inventories, heuristic or index-based approaches, process-based methods, and statistical methods (Reichenbach et al. 2018). Physical and statistical methods are preferred to ascertain landslide susceptibility in quantitative terms (Liu and Duan 2018;Hemasinghe et al. 2018;Murillo-García et al. 2019). According to Reichenbach et al. (2018), statistical landslide index (SLI), weight of evidence (WOE), and logistic regression (LR) models are the most common statistical landslide susceptibility models.
This study investigates landslide susceptibility assessment for urban areas using geographic information system (GIS)-based statistical models, considering the seaside city of Gdynia, Poland, as a case study. SLI, WOE, and LR models were developed, corresponding landslide susceptibility maps were obtained, and the results were compared.
A new aspect proposed in this study is the application of existing models to Gdynia and an objective determination of the impact of environmental factors on landslides in this area. Gdynia has available detailed large-scale topographic and environmental data. Its area is characterized by diverse intensity of hazards, i.e. more significant hazards along the coastline and less significant hazards further inland. The criteria for the selection of data representing the landslides are enumerated and the use of diverse thematic layers has a significant impact on the results of the analysis. Following the example of Reichenbach et al. (2018), in this paper, the terms 'landslide', 'slope movement', 'mass movement' and 'slope failure' are used as synonyms.

Study area
The city of Gdynia, Poland, is located in an area characterized by high relief and rapid development related to population growth. Along with the cities of Gdańsk and Sopot, Gdynia is part of a larger urban development region. Gdynia is located on the western side of the Gulf of Gdańsk (Fig. 1) between 54° 25′ 22″ N and 54° 35′ 4″ N and 18° 21′ 30″ E and 18° 34′ 7″ E, covering an area of 135.14 km 2 . Currently, Gdynia has a population of 246,309, with 1,823 people per km 2 (Demographic Yearbook of Poland 2019). The local climate is influenced by the relief and the distance from the Baltic Sea. The altitude ranges from 0 to 205.7 m above sea level. The maximum temperature reaches 35 °C during summer, while the minimum is − 20 °C during winter (Kańska et al. 2019). According to the local weather station, the average annual rainfall over the past 34 years has been approximately 530 mm, and most rainfall occurred in July (Kańska et al. 2019).
In terms of geomorphology, the study area is mainly located within the moraine plateau ( Fig. 1). It was formed during the Weichselian (Vistulian) glaciation as a result of the accumulation of moraine deposits (tills, sands, and gravels). During deglaciation, the moraine plateau was incised by a glaciomarginal outflow system (Mojski 2005), which caused the division of the moraine plateau into the main part in the west and isolated fragments in the eastern part ( Fig. 1). The edges of the moraine plateau were cut by erosional valleys. Erosional processes began from the late-glacial period and continued throughout the Holocene (Woźniak 2014).
The Weichselian glaciation and development of the Baltic Sea were of fundamental importance in shaping the present land surface. The Vistulian glaciation lasted from 115,000 to 11,700 years ago (Mojski 2005;Walker et al. 2009;Marks et al. 2016). The relief was created relatively rapidly, relating to the disappearance of the ice sheet over approximately 1000 years (Mojski 2005). The present-day Baltic Sea had its beginnings in the retreat of the last Scandinavian ice sheet, which melted approximately 15,500 to 14,500 years ago in the southern part of today's Baltic Sea (Uścinowicz 2014). On the western side of the Gulf of Gdańsk, the cliffs began to form approximately 6000 years ago and their general shape remains similar to this day (Uścinowicz 2003;Gałka et al. 2018). Currently, the sea forms the natural limits of the city along the 11.5 km of shoreline in the east (not including the seaport), 8.5 km of which are accounted for by cliffs (Fig. 1).
There are two main lithostratigraphic units in the research area. The older unit is composed of fine sands, silts, and lignite of the Neogene (Pikies 2001;Pikies andZaleszkiewicz 2004, 2013a, b;Woźniak et al. 2018). These deposits are only exposed in cliffs. The younger unit consists of glacial tills, fluvioglacial sands, and gravels, along with limnoglacial silts and clays of Pleistocene as well as Holocene fluvial sands, muds, and peat (Pikies 2001;Pikies andZaleszkiewicz 2004, 2013a, b;Woźniak et al. 2009Woźniak et al. , 2018Woźniak and Czubla 2016). Holocene alluvial and colluvial deposits occur along the valley bottoms. On the upland slopes, these deposits have been affected by erosion and glaciotectonic deformation, which makes the slopes vulnerable to mass movements.

Landslides in Gdynia
Owing to its coastal location, the area of Gdynia was threatened by slope movements even before the town was founded in 1926. The earliest information on threats related to mass movements in the area of present-day Gdynia dates to the end of the sixteenth century and relates to cliff coasts (Małka 2019). The maximum annual shoreline recession in the second half of the nineteenth century was 2 m. Currently, the maximum shoreline recession is 1 m per year (Małka et al. 2017). Landslides on cliff slopes are usually activated several times a year; these are typically successive and deep-seated. Unfortunately, there is no reliable data on the depth of their surface of rupture. The landslide near the village of Jastrzębia Góra (approximately 45 km to the northwest of the research area) is the only one that has been thoroughly studied by means of deep borings. The results of the inclinometer measurements indicate the depth to the surface of rupture at 5.5-9 m b.g.l. (Zabuski and Kulczykowski 2020). The factors that control cliff failures are wave action and physical-chemical weathering (Montoya-Montes et al. 2012). The important trigger that activated landslides in the study area was storms. In Gdynia, storms with an average speed of 17 m/s occur 1.6 times a year on average. From 1955 to 2015, the highest number of storm floods (six) occurred in 1993, while the highest number of storm hours (85) were noted in 2007(Kańska et al. 2019. A single storm can initiate several landslides in the coastal zone of Gdynia. Other factors that trigger these landslides are excessive rainfall and human activity. The first-documented inland failure (not on coastal cliffs), which damaged the railway embankment in Wielki Kack, occurred in 1930 (Małka 2019). In the first half of the twentieth century, inland mass movement only developed occasionally. Initially, intensive urban development occurred around the industrial area near the Port of Gdynia and at the foot of the uplands, in areas that primarily consist of sand and gravel deposits in the alluvial fans of numerous valleys. Currently, urban development is progressing into the moraine plateau area, which is characterized by significant change in elevations, resulting in strong transformations of the relief. At the turn of the twenty-first century, intensive urban development combined with an increase in the frequency of extreme hydrometeorological events contributed to a decrease in slope stability (Małka et al. 2017), leading to a significant increase in landslide risk, which is associated with an increase in both the number of hazards and the elements at risk (Aleotti and Chowdhury 1999). Unlike the landslides on cliff slopes, inland landslides are shallow at approximately 2-3 m (Małka et al. 2017;Małka 2019). On average, over the past six years, various triggers have activated three inland landslides per year. Inland landslides in Gdynia can be triggered by natural physical processes such as heavy daily rainfall (> 100 mm), rapid snowmelt, and slope undercutting by rivers. They can also be generated by human impacts such as slope excavation and loading (e.g. road and construction works), redirecting rainfall-runoff in such a way that the flow is concentrated, adding water to the slope by irrigation, and land cover changes (e.g. deforestation), or a combination of natural and anthropogenic processes (Małka et al. 2017).
Various forms of damage have been caused by previous slope failures in Gdynia. For example, considerable destruction occurred within the city after heavy rainfall in July 2016. The damage included failure of infrastructure embankments (railways and roads), damage to and burying of several buildings, shifting and cracking of retaining walls, destruction of a car park, burying of cars, destruction of a bridge, destruction of stairs descending on to the beach, and destruction of a playground (Małka et al. 2017). In February 2017, after rapid warming, a road was destroyed, a historic villa and a car were buried following a landslide in Kamienna Góra (Fig. 2). This phenomenon was also probably associated with the heavy rainfall that occurred in July 2016. In other words, rapid snowmelt induced slope failure; however, this snowmelt was preceded by water retention that occurred several months earlier. Thus far, the mass movements in Gdynia have not resulted in any casualties. However, considering their dynamic characteristics (velocity of 5-15 m/s), they could prove fatal in the future (Małka et al. 2017;Małka 2019).

Landslide inventory
For the case study of Gdynia, the 'Map of Landslides and Risk Areas for the City of Gdynia, 1:10,000' (Szarafin et al. 2015), which was updated in 2018, was used as the basic data source. The 2018 landslide inventory was prepared by combining multiple techniques (Table 1). After heavy rainfall in July 2016, extensive fieldwork was carried out to obtain a large number of in situ observations. The slope failures in Gdynia were classified during fieldwork considering material and movement type based on the modified Varnes' landslide classification (Cruden and Couture 2011). In Gdynia earth slide (typically rotational), earthflow (wet and dry), or combined complex earth slide and earthflow occur dominantly (Fig. 2). Additionally, however, less often, earth fall and earth topple in glacial till occur in the cliff area (Małka et al. 2017;Fig. 2). In total, 180 landslides were mapped in the study area (Fig. 3), 95% of which were slides or complex slope movement. Because inland failures are generally small, differentiation of their flow from the complex type is very difficult. The total landslide area was 94 ha. The smallest landslide disturbed an area of approximately 172 m 2 , whereas the largest landslide, which occurred along the coastal slopes, disturbed an area of 84,397 m 2 (Małka et al. 2017). On average, a landslide affects an area of 8314 m 2 , whereas the median affected area is 3006 m 2 . In Gdynia, slope instability is determined by natural factors due to climatic conditions, slope geometry, presence or absence of cliffs, and anthropopression associated with high density of population.

Landslide conditioning factors
First, thematic layers were collected, sorted, and prepared to showcase the conditioning factors. The research considered geological, geomorphological, hydrological, hydrogeological, and anthropogenic conditions (Table 1). Thirteen selected passive factors visualized in the form of maps were used to model the landslide susceptibility of the studied area (Fig. 3). Most of the geofactors were obtained by further processing of the original data. The thematic data used in the analysis have a cartometric accuracy of a 1:10,000 map. The resolution (pixel size) of all the raster maps used in the analysis was 5 × 5 m. Based on inputs from previous studies such as Conoscenti et al. (2008), Devkota et al. (2013), Youssef et al. (2016), Wang et al. (2016), Lee et al. (2017), and Pamela et al. (2018) the following topographic attributes were considered: slope angle, slope aspect, vertical curvature, elevation, relief energy (computed in the vicinity of 5 m from each raster cell), topographic wetness index (TWI), stream power index (SPI), and slope length factor (LS).
These factors play different roles in the formation of landslides. Slope angle controls the balance of retaining and destabilizing forces acting on a slope (Wu and Sidle 1995). Slope aspect affects soil moisture, precipitation intensity, and vegetation density of a slope (Wang et al. 2016;Saleem et al. 2019;Magliulo et al. 2008). Elevation determines the degree and type of erosion and human activity (Gritzner et al. 2001;Dai and Lee 2002). Hydrological conditions such as TWI and SPI also have an influence on mass movements. TWI describes the effects of topography on the spatial distribution and extent of saturated areas (Wilson and Gallant 2000). SPI is a measure of the erosive power of water flow (Moore et al. 1991). The LS index calculates the spatially distributed sediment transport capacity, predicts the locations of areas of net erosion and net deposition, and shows the spatial distribution of soil loss potential (Wilson and Gallant 2000). Note that the LS factor is the ratio of the expected soil loss from a field slope relative to the original universal soil loss equation unit plot and represents the effect of the slope's length and steepness on splash, rill, and sheet erosion due to water (Wischmeier and Smith 1978).
The above attributes were calculated directly from a high-resolution digital elevation model (DEM) obtained via airborne laser scanning (ALS). These data are also described as light detection and ranging (LIDAR) data (Lohani and Ghosh 2017). The point density within an ALS-based point cloud in Gdynia was 12 pts/m 2 . The Jenks natural break  (Szarafin et al. 2015), digital elevation models*, aerial photographs*, orthophotographs*, periodic field investigations*, archival documents*, geological and engineering documentation*, monitoring data*, stability analyses*, scientific publications*, press articles and internet sources*, Google Earth™ imagery* Land use and land cover 1 3

Fig. 3
Conditioning factors used in GIS analyses algorithm was used for grouping slope angles, elevation, relief energy, TWI, SPI, and LS (Małka 2015;Veerappan et al. 2017;Arabameri et al. 2020). The remaining thematic maps included land use and land cover (Urban Atlas 2012), lithological, and groundwater table maps (Frankowski et al. 2007). Land cover controls slope stability through increased or decreased evapotranspiration and root strength. Land use is associated with urbanization processes. Urbanization results in irrevocable changes to the landscape, the spread of the built environment, and an increase in anthropogenic activities (Sharma et al. 2019).
Lithological data are presented using maps showing the geological structure in a horizontal cut at a depth of 1 m and 4 m b.g.l. These maps were prepared by Euclidean allocation based on representative archival boreholes collected in a database (Frankowski et al. 2007). A layer indicating distances from watercourses and reservoirs was also used (Fig. 3, Table 1). Fluvial processes such as river erosion and sea abrasion as well as drainage density determine the likelihood of a landslide in young morainic areas. Close river proximity induces instability at the foot of a slope due to the concentration of groundwater flow; further, river and sea undercutting also have destabilizing effects (Reichenbach et al. 2018).

Landslide susceptibility analysis
Landslide susceptibility is determined by considering the interplay between landslide occurrences and impacts of factors that induce instability. In the case of Gdynia, because of the small number of active landslides, all landslides (180) were included to achieve reliable geoinformation analysis using statistical methods. Both the entire landslide area (landslide body) and points located on the main landslide scarps (landslide detachment zone) were considered for modelling (Magliulo et al. 2008). First, the landslide dataset (100% including 180 landslides) was divided into training and validation sub-datasets covering 70% and 30% of all landslides, respectively (Fig. 3). The training sub-dataset and thematic maps were used to build a training model, while the other sub-dataset was used for validation. As stressed by Chung et al. (1995) and Chung and Fabbri (2003), random partition is a mandatory step in prediction modelling. Because of the concentrated spatial distribution of landslides in Gdynia, stratified random sampling was used to ensure the representativeness of the training and validation sub-datasets. Stratification was based on the four main geomorphological forms distinguished in Gdynia: glacial and fluvioglacial, denudational, river forms and sea forms (cliffs) (Pikies 2001;Pikies andZaleszkiewicz 2004, 2013a, b).
Landslide susceptibility maps were developed using three types of statistical methods: SLI, WOE, and LR. The mathematical foundations of these methods are explained in detail in the literature (Bonham-Carter 1994;Van Westen 1997;Hilbe 2009).
Landslide inventory and maps of environmental factors (Fig. 3) were used to calculate the landslide susceptibility index (LSI) using the SLI method (Van Westen 1997; Sarkar et al. 2008). The bivariate SLI method has been used by various researchers (Sarkar et al. 2008;Yalcin 2008;Magliulo et al. 2008;Magliulo 2010;Pourghasemi et al. 2013;Małka 2015;Wang et al. 2016;Liu and Duan 2018). The SLI method is based on the following formula: (1) where LSI-landslide susceptibility index; W i -the weight is given to a certain parameter class; Densclas-the landslide density within the parameter class; Densmap-the landslide density in the whole study area; Npix(Si)-number of pixels which contain landslides in a certain parameter class; Npix(Ni)-total number of pixels in a certain parameter class. Additionally, predictor rate (PR) was used to determine the weight of individual factors. Equation (2) was applied to the rating of every spatial factor of the landslide's dataset.
where SA is the index of the spatial association of spatial factors and landslides (Althuwaynee et al. 2014).
The second method used to model landslide susceptibility is the bivariate WOE method. It is a log-linear version of general Bayesian theory (Bonham-Carter et al. 1989;Bonham-Carter 1994), and it is frequently used to determine the significance of the impact of instability factors on the occurrence of mass movements (including Van Westen 1993; Lee et al. 2002;Van Westen et al. 2003;Klingseisen and Leopold 2006;Blahut 2009;Pamela et al. 2018;Liu and Duan 2018). This model is based on the following formulas: where N pix -the number of pixels representing a given variable; L, F-the presence of a landslide and the presence of a conditioning factor; ∼ L, ∼ F-no landslide and no conditioning factor.
The third method used to model landslide susceptibility is the multivariate statistical method of LR. The LR method has been used by various researchers including Wiegand (1996), Ayalew et al. (2005), Trigila et al. (2015), Bai et al. (2015), Lee et al. (2016) and Hemasinghe et al. (2018). The LR method is expressed as follows: In Eq. (5), P is the probability of a landslide occurrence and varies from 0 to 1 on an s-shaped curve, z represents a linear logistic model and varies from − ∞ to + ∞. The linear logistic model is defined as In Eq. (6), b 0 is the intercept of the model, n is the number of independent variables, b i (i = 1, 2, 3, …, n) is the regression coefficient of the model, and x i (i = 1, 2, 3, …, n) is the independent variable (Bai et al. 2015).
The performance of the susceptibility model was evaluated using the success and prediction rate criteria of Chung and Fabbri (2003). In susceptibility assessment, many different sets of factors were included. The success rate curves (SRCs) were used to select the significant variables for landslide susceptibility analysis (Remondo et al. 2003). The independence of variables was checked using chi-square (χ 2 ) tests in ArcSDM® software (Sawatzky et al. 2010). The best models selected according to SRCs were used for generating prediction rate curves (PRCs) and final susceptibility maps.
The application of a high number of variables, various methods of presenting landslides, and various statistical methods enabled the selection of the best model and compare the results. All susceptibility models were developed using the classes determined by the Jenks algorithm (Conforti et al. 2011;Devkota et al. 2013;Youssef et al. 2016;Veerappan et al. 2017).
The selection of SLI, WOE, and LR models in assessing the landslide susceptibility of Gdynia was determined by their advantages (effectiveness, efficiency, and universality). In addition, the statistical approach allows the interpretation of the role of individual passive factors in the formation of landslides, which allows for the comparison of results. The landslide susceptibility models were developed using ArcGIS 10.3.1 and ArcSDM® (Spatial Data Modeller) software (Sawatzky et al. 2010).

Landslide susceptibility map developed using SLI
Various combinations of variables and numerous methods of presenting landslides were used in assessing landslide susceptibility via the SLI method (Tables 2 and 3). The roles of contributing factors determined using the PR for the entire landslide area and points located on the main landslide scarps are different. PR significances of individual passive factors were different for the overall landslide dataset (100% landslides, 180 failures) and training sub-dataset (covering 70% of all landslides), which are presented in Table 3.
More reliable results were obtained through the SLI method using points located on the main landslide scarps. Verification of landslide susceptibility maps using the success curves demonstrated the lower suitability of entire landslide areas for susceptibility analysis. Therefore, only the results obtained for the detachment zones are described in detail below.
In the modelling of landslide susceptibility of Gdynia using SLI, the most significant geo-environmental variables were land use and land cover as well as the topographic attributes such as slope angle, TWI, relief energy, LS, vertical curvature; noteworthy were also local geology and hydrogeological conditions ( Table 2). The impact of the most important conditioning factors on all mass movements is presented in Table 4, and in detail in Supplimentary Appendix 1.
In Gdynia, the value of LSI index rises gradually with increasing slope. Steeper slopes (above 41°) are more prone to landslides (lnWi = 4.97). Relief energy also controls mass movements (Table 3, 4); the LSI values reaches high values for relief energy 1-3 m in the case of 77 landslides (lnWi = 1.64) and for relief energy 3-5 m in the case of 42 landslides (lnWi = 4.52). A clear spatial correlation exists between LS index and slope failures. In Gdynia, high LSI values are associated with LS intervals of 39-61 (33 landslides) and 61-99 (29 landslides), with lnWi values of 3.94 and 4.75, respectively. Interdependence between increase in LS index value and intensities of slope failures is evident.
For Gdynia, high values of LSI were observed in the coastal zone (lnWi = 3.78); note that 36.11% of landslides is associated with this zone. Slope failures in the shore       zone are initiated by sea abrasion. High values of LSI index also appear for green areas (lnWi = 1.47).
Other factors controlling mass movements are geological and hydrogeological conditions. In Gdynia, the highest values of LSI (lnWi = 1.12) were observed for the TWI interval of 0-3 accounting for 43.33% of landslides. This indicates the link between predominant shallow mass movements and soil moisture parameters in Gdynia. A shallow groundwater table of 0-1 m is frequently associated with the surface of rupture and demonstrates its role in landslide susceptibility. Geological structure, mainly diversity of lithological layers, is another significant factor in the formation of mass movement in young morainic areas. The coexistence of consolidated and unconsolidated materials and cohesive deposits favours rainwater infiltration and the initiation of surface of rupture (Małka 2015;Małka et al. 2017). For a ground type at a depth of 1 m b.g.l., high thickness of soils (≥ 1 m), silty sands, and embankments and tills exhibit high landslide susceptibility with lnWi values of 0.90, 0.36, 0.30, and 0.12, respectively. Note that for a ground type at a depth of 4 m b.g.l., embankments as well as sands with gravels contribute to the occurrence of landslides with lnWi values of 0.65 and 0.08, respectively.
In this analysis, the use of the statistical index method and 13 thematic layers has revealed the importance of many environmental factors. Various combinations of instability factors were used in the modelling. Figure 4 presents the landslide susceptibility map with a scale of 1:10,000 created using calculations through the SLI technique, 10 variables, and all landslides (Model SLI P 5). According to this model, the very low landslide susceptibility class (impossible, unlikely) occupies approximately 33% of the area of Gdynia, the low possibility zone accounts for 17% of Gdynia, possible and moderately possible zones are connected with 38%. Highly possible zone covers 12% and is distributed on coastal cliff as well as in numerous valleys of the moraine plateau area.

Landslide susceptibility map developed using WOE
Initially, 13 instability factors presented in the form of maps (Fig. 3) were used in modelling landslide susceptibility of the studied area using the WOE method. The WOE method allowed for the determination of the importance of predisposing factors. Supplimentary Appendix 2 shows the spatial relationships between each of the passive factors and all landslides calculated using the WOE technique. Weights (W + , W − ) and contrasts (C) were calculated for 13 predisposing factors (Supplimentary Appendix 2).
The equations of the WOE method are correct when the predictors are conditionally independent (Bonham-Carter 1995). Various combinations of instability factors were used in the modelling (Table 3; Supplimentary Appendices 2, 3). Owing to the pair's conditional dependence, the following four predictors were ultimately applied in modelling susceptibility using the WOE technique: slope angle, slope aspect, ground type at a depth of 1 and 4 m b.g.l. (Model WOE P 1). These results are presented in Table 5.
Landslide intensity increases with a rise in the predisposing factor of slope, which is reflected in the increase in weights (W + ) and contrast (C). Strong and very strong predictive significances are associated with slope angles from 23° to 63°. In the case of slope aspect, eastern slopes (W + = 0.72) are most predisposed to landslides, which should be associated with the orientation of coastline.
Geological structure is another factor that contributes to the conditioning of the formation of mass movements. In the case of ground type at a depth of 1 m b.g.l., soils with high 1 3

Fig. 4
Landslide susceptibility maps of Gdynia developed using statistical landslide index (SLI), weight of evidence (WOE), and logistic regression (LR) methods

Table 5
Spatial relationship between the more important conditioning factors and all landslides using the weight of evidence (WOE) method   The final landslide susceptibility map of Gdynia (with a scale of 1:10,000) created using calculations through the WOE technique and based on all landslides is presented in Fig. 4 (Model WOE P1). According to WOE P1 model, areas where the development of landslides is impossible or unlikely occupy approximately 65% of the area of Gdynia, zones with low possibility of landslides account for 29%, and possible and moderately possible classes zones are connected with 5%. However, only 0.03% of the study area is covered by highly possible landslide occurrence zones and is distributed irregularly on coastal cliff.

Landslide susceptibility map developed using LR
The four predictors of slope angle, aspect, ground type at 1 and 4 m b.g.l., as in the case of WOE modelling, were also used for LR susceptibility modelling.
The relative importance of independent variables in predicting the occurrence of landslides is expressed by the regression coefficient ( Table 6). The results reveal that the slope angle factor has the highest coefficients (β = 9.04, 10.93, 11.62, 11.82, 13.01, 14.23, 15.33, and 15.72), and it is the most influential factor in indicating the occurrence of landslides in the area. The value of the coefficient increases in proportion to the slope angle. Strong predictive significance is associated with a slope angle of 14°-63°, indicating a direct correlation with landslide occurrence. Slope aspect and ground type make a less considerable contribution to the occurrence of landslides. With regard to slope aspect, the east-(β = 0.55), north-(β = 0.09), and northeast-oriented (β = 0.20) slopes have a high probability of landslide susceptibility as they have a positive coefficient. The northwest-, south-, and southwest-oriented slopes have negative coefficients and are, thus, correlated with stable areas. In terms of geology at a depth of 1 m b.g.l., silty sands (β = 0.59), embankments (β = 0.39), soils (β = 0.26), tills (β = 0.04), and sands with gravels (β = 0.02) are the most susceptible to sliding. However, for a ground at 4 m b.g.l. depth, sands with gravels, and embankments (β = 0.30 and 0.19, respectively) are susceptible to landslides. Sands at a depth of 4 m b.g.l. has a negative effect in landslide formation as it has negative coefficients.
The final map of the Gdynia landslide susceptibility (with a scale of 1:10,000) created using calculations through the LR method and based on all landslides is presented in Fig. 4 (Model LR P1). With regard to the LR P1 model, there is a clear prevalence of zones with very low and low landslide susceptibility. Areas where the development of landslides is impossible or unlikely cover approximately 60% of Gdynia. However, areas with low possibility of landslide occurrence account for 20% of Gdynia. Possible and moderately possible classes are associated with 18% of the study area. Zone of high landslide possibility represents only 1.25% of Gdynia and occurs on coastal cliffs as well as in a few valleys in the south-east part of the moraine plateau area. On coastal cliffs, the highest landslide susceptibility is located at the top of the slope due to the fact that they are steep.

Performance evaluation of models
Different indices are used to evaluate the performance of a susceptibility model (Reichenbach et al. 2018). To measure performances of the landslide susceptibility prediction models, their prediction performances were evaluated using model fitting techniques. In the case study of Gdynia, the statistical landslide susceptibility models were compared using SRCs and PRCs. Different combinations of variables were used to select the most suitable function, significant conditioning factors, and improving the efficiency of the susceptibility mapping process (Remondo et al. 2003).
In the first stage, 12 SRCs were used to analyse the performances of the statistical models (Fig. 5). These 12 models include various methods of presenting landslides (the entire landslide area and points located on the main landslide scarps) and different combinations of passive factors (Table 3). Based on the fitting of SRCs, a comparison of Model SLI A1, Model SLI A2, and Model SLI A3 with other models show that the application of the entire landslide area provides poor results (Fig. 5). However, the use of more passive factors significantly improves the results (Model SLI A1 and Model SLI P1).
In general, analyses of SRCs shapes show that the use of a higher number of conditioning factors improved the results of SLI methods, which was most noticeable when comparing four (Model SLI P3) and 10 conditioning factors (Model SLI P 4). For more than 10 layers (13 factors-Model SPI P1), no marked improvement was seen (Fig. 5).
Note that model SPI P2, which was performed without LS, slope angle, TWI, and relief energy, resulted in the worst susceptibility map (Fig. 5). Therefore, these factors are the most important in susceptibility assessment of Gdynia (Table 2). In the analyses by WOE and LR techniques, the best models (Model WOE P1 and Model LR P1) were constructed by a combination of slope angle, slope aspect, and ground type at a depth of 1 and 4 m b.g.l. Lithology, slope, and slope aspect are also often used for susceptibility mapping by many other researchers (Ayalew et al. 2005;Yalcin 2008;Magliulo et al. 2008). The use of the LS index instead of slope angle (Model WOE P2 and Model LR P2) did not improve the results (Fig. 5).
In the next stage, the most suitable SRC function from each technique (SLI, WOE, and LR) was selected. For the best four models, statistical verification was performed with the help of PRCs (Fig. 6). The shapes of these curves indicate that the LR P1 susceptibility model shows better performance than the WOE P1 model and all the SLI models (Fig. 6). The LR P1 model was created by considering the interactions of a greater number of factors and thus, it provides a more realistic picture of landslide susceptibility. This conclusion is in line with an expert opinion. If a susceptibility map was based on geomorphological fieldwork, it would be similar to the LR P1 map. On the SLI P4 map (Fig. 4), the zone of high landslide possibility includes numerous valleys on the moraine plateau, where no landslides were observed during fieldwork, and the slopes were stable. Additionally, the SLI P4 map does not show that the occurrence of mass movements is more likely in the coastal zone than in inland areas. On the other hand, the WOE P1 map shows that in many cliff areas where active landslides were observed (e.g. Orłowo Cliff- Fig. 2a) moderately possible zones occur (the LR P1 map shows highly possible zones).

Discussion
Over the next 50 years, the frequency of slope failures and associated threats to urban infrastructure and buildings in Gdynia are likely to increase. This is evident in the context of the city's demographic trends, global warming, and the increase in frequency  (Małka et al. 2017;Kańska et al. 2019). Unfavourable directions of demographic change have been progressing since 2001 as the flat areas are depopulated (this is associated with higher house prices), while population is increasing in morphologically diverse areas characterized by high relief energy. Following the increase in population, construction of residential areas in these vulnerable areas is also intensifying. The observed landslides were caused by the cumulative effect of preconditioning and triggering factors.
The selection of landslide representation approach, instability factors, and statistical methods is significant in the GIS-based statistical analysis of landslide susceptibility. High quality and adequacy of geospatial data are also important. The use of the most accurate and up-to-date information on landslides enables implementation of effective susceptibility models (Galli et al. 2008). The selection of data representing landslides has a significant impact on the results of the analysis (Magliulo et al. 2008). In the case of Gdynia, to meet the above-mentioned conditions a detailed inventory of landslides was made and two methods of representing landslides were used. The use of points representing the detachment zone yielded better results than the use of the entire landslide area, which is also confirmed by the results of earlier studies (Magliulo et al. 2008). According to Magliulo et al. (2008), differences between the landslide and the landslide detachment zone must occur in geo-environmental classes prone to failure. Moreover, calculating weighting values considering the detachment zone provides more satisfying results (Magliulo et al. 2008). Some researchers recommend the application of seed cells that represent the conditions before a landslide occurs, which is achieved by adding a buffer zone to the crown of the landslide (Süzen and Doyuran 2004;Conoscenti et al. 2008).
Data availability and nature of the study area are significant in preparing thematic layers for susceptibility assessment (Ayalew et al. 2005;Magliulo et al. 2008). Comprehensive topographic and environmental databases that are created for cities are essential for GIS analysis. The selection of geospatial data in this study was determined by relevance, reliability, and compatibility of scales. The accuracies of thematic layers are important for Fig. 6 Prediction rate curves for selected models susceptibility mapping. However, there is a lack of standardization in the selection of conditioning parameters (Ayalew et al. 2005;Yalcin 2008). Ayalew et al. (2005) attempted to solve this problem by drawing attention to five important criteria. In GIS-based studies, the selected factor should be operational, complete, non-uniform, measurable, and nonredundant (Ayalew et al. 2005;Yalcin 2008;Magliulo et al. 2008). Although Gdynia has a substantial amount of geospatial data, much of these (e.g. detailed geological map of Poland, geoengineering data, and geodynamic maps) were rejected in the analysis as they did not meet the above-mentioned criteria. Initially, the analysis of landslide susceptibility in Gdynia included dependent data to determine the best layers. The most common factor used in the susceptibility analyses was slope angle (Reichenbach et al. 2018). Presumably, slope angle can be replaced by some other topographic attribute. The results obtained in this study, verified by success or prediction curves, indicate that slope layer is the most appropriate factor. According to Chung and Fabbri (2003), Remondo et al. (2003), and Ayalew et al. (2005), an assessment of independence of layers to be used in susceptibility maps is essential for landslide susceptibility mapping. Although independence tests of layers are important, they are not considered in many studies (e.g. Devkota et al. 2013;Murillo-García et. al. 2019). Numerous papers (e.g. Conforti et al. 2011Magliulo 2012) suggest avoiding the simultaneous use of mathematically interrelated factors (e.g. topographic attributes such as LS, SPI, TWI, and slope angle) because they produce a double effect in the final susceptibility map. Magliulo (2012) suggests that redundant factors should be avoided even if their geomorphological and hydrological significances are different and relevant. In the case of Gdynia, there is a strong conditional relationship between slope angle and TWI, SPI, LS, elevation, vertical curvature, land use and land cover as well as distances from watercourses and reservoirs (Supplimentary Appendix 3). The chi-square (χ 2 ) tests also show the dependence of geological data, slopes, and groundwater level. Groundwater occurs in unconsolidated, permeable sediments (sands and gravels), and tills are usually impermeable; groundwater level also depends on the slope angle.
According to Remondo et al. (2003) validation helps select significant variables. In the case of Gdynia, validation of susceptibility maps of an independent group of landslides (using PRCs) proves that the use of the independent layer in more advanced mathematical models such as LR provides better results than that using many dependent factors in a basic method such as SLI (Fig. 6). Prior research in other regions confirmed that a high number of variables included in the analysis does not necessarily increase the quality of the model (Remondo et al. 2003). This study confirms that the use of an independent dataset in modelling and validation is fundamental. However, validation of susceptibility maps of a dependent group of landslides (using SRCs) provided better results with many instability factors in the SLI method. Still, the SLI models (e.g. SLI P4 and SLI P1) exhibit overprediction (Fig. 4). Zones of high landslide possibility include whole valleys on the moraine plateau where no landslides were observed. It results from the fact that the mathematical model of SLI method does not take into account areas without landslides.
Moreover, the fact that the dependence of factors is probably influenced by the scale and accuracy of data cannot be ignored. According to Tobler (1970), 'everything is related to everything else, but near things are more related than distant things'. Field studies in Gdynia indicate that slope angles depend on geological structure, e.g. the slopes composed of glacial tills are steeper (Fig. 2). When the resolution of geological data is close to that of the digital elevation model (DEM), slope and geological structure are dependent variables.
It is important to note that not all the relationships between the conditioning factors and landslides observed during fieldwork can be easily modelled through statistical methods. Limitations arise not only because of the accuracy of data and many active factors (triggers), but also because of the dynamic, changeable, and expansive characters of the processes that occur in urban areas. An analysis of these limitations will not only allow for a better understanding of geo-environmental factors, but also provide insights that can be used in the future susceptibility assessments in urban areas.
Rainwater infiltration is an important climatic trigger leading to the activation of downslope movements in Gdynia. The threshold value that initiates mass movements in Gdynia is > 100 mm of daily precipitation (Małka et al. 2017). The amount of rainfall in Gdynia is closely linked to altitude (Fig. 7). Recent slope failures in Gdynia were mainly induced by high-intensity rainfall and human activity. After heavy rainfall in July 2016, many rapid earthflows occurred, which had an impact on urban infrastructure, mainly on the Pomeranian Metropolitan Railway. During this period, the daily precipitation in the centre of Gdynia was 110.4 mm (Kańska et al. 2019). The worst damage occurred in parts of the morainic plateau at high altitudes, where precipitation was intense (Fig. 7).
Another hydrometeorological factor triggering landslides in Gdynia is the sudden winter warming that extends to several days. In Gdynia, freezing and rapid thawing leading to slope failure is usually preceded by strong soil irrigation associated with heavy rainfall, which occurs several months earlier. It is crucial to note that the landslides that occurred in Gdynia in February 2011 and 2017 developed after extreme precipitation in July 2010 and 2016, respectively. A similar process was observed by Zabuski et al. (2015) in other areas of the Polish Lowlands. Therefore, the increased risk of mass movements in the city must be considered. Failure triggers related to rainfall events and inputs from rapid snowmelts occur in other parts of the country as well (Mrozek et al. 2016).
During the geological and historical periods, mass movements in Gdynia were more intense than those in the present day (Małka 2019). The oldest and largest slope failures occurred during the Weichselian or early Holocene and were caused by melting of the permafrost. The paleo-landslide mechanism then was probably analogous to that of the present-day Arctic. In warm periods, thaw consolidation at the base of the active layer of the slope or in the thawing transient layer induced high porewater pressure and reduced the effective shear strength (Lewkowicz and Way 2019). Many of the landslides in Gdynia are associated with late-glacial and post-glacial decohesion processes. According to Cossart et al. (2013), post-glacial decohesion is often coupled with over-deepening of valleys and slope steepening, which generates instability.
Landslide distribution, highest slope angles, and highest relief energy values can be correlated for the most part with the old drainage network formed during the Pleistocene, particularly with tunnel valley edges. Additionally, several landslides have been observed in areas of buried tunnel valleys. The contemporary landform in the case of tunnel valleys is analogous to that of the sub-Quaternary topography (Pikies 2001;Pikies andZaleszkiewicz 2004, 2013a, b).
The geological structure in Gdynia is highly variable even over several metres, which cannot be represented with a scale of 1:10,000. Glaciotectonics is important in the formation of landslides. Miocene lignitic clays and silts are sensitive and tend to swell; when these encounter water, their volume increases to a critical level, and then decreases sharply. This is a very dangerous phenomenon for buildings and is difficult to model. It is not clearly reflected by the weights calculated using SLI (lnW = 0.09) and WOE (W + = 0.08).
The landslide susceptibility assessment of Gdynia shows that thick soils (≥ 1 m) is an important determinant of slope instability as it is likely to experience degradation and erosion. The moraine plateau is mainly composed of brown soil containing clay minerals (illite) in the cambic horizon. The parent materials (bedrock) of these soils are glacial tills (Pikies 2001;Pikies andZaleszkiewicz 2004, 2013a, b). Presumably, the stratification of the soil horizon and the presence of clay minerals favour water stagnation in lower horizons.
The modelling results and field study reveal the significant role of embankment characteristics. In July 2016, a dozen landslides occurred within the embankments of Gdynia (Fig. 7). The very poor compaction of these embankments influences mass wasting. The mineral material is often collected locally and arranged in accordance with the embankment slope, thereby creating slip surfaces. Geomorphological processes, which are related to landscape evolution during the Weichselian (glacial and fluvial processes connected with valley formation) and the Holocene (river erosion, coastal abrasion, sheet and rill erosion) have the greatest impact on the formation of slope failures in Gdynia. Human activity in Gdynia contributes significantly to increased erosion processes. Erosion is often concentrated where the forest cover has been removed for urbanization and construction purposes (Egboka et al. 2019). Landslide slopes are more prone to rill and gully erosion because they are clear of vegetation. Gully erosion often develops along ancient landslides, which promotes the concentration of running water (Morgan 2005). Subsequently, small mass movements occur within gullies leading to landslides caused by gully erosion (Conforti et al. 2011). In Gdynia as well, rill erosion and later gully erosion often develop along mass movements, which was observed during fieldwork. Furthermore, PR values indicate the significance of the LS and SPI indexes (Table 2), that are also important variables for potential erosion assessment. Presumably, due to the close relationship and synergy of gully erosion and landslides, analogous passive factors are considered in the analyses of the susceptibility of sheet, rill, and gully erosion, as well as landslides (Magliulo et al. 2008;Magliulo 2012).
Another phenomenon that proves difficult to document is the dynamic environment of an urbanized area. Thematic layers are static; however, even small changes in the values and distribution of the geo-environmental conditions in a city can have consequences for landslide processes or triggers. Anthropogenic factors, which include undercutting, slope gradient change, and ill-considered logging, are crucial to these processes. Reduction of infiltration and retention areas due to development is a characteristic feature of the urbanized parts of Gdynia. In 2004 and 2016, uncontrolled runoff along road surfaces in Gdynia repeatedly activated dangerous landslides. Such processes can be predicted after long-term observations of a single slope failure. However, it is difficult to present these using statistical methods. It will be useful to create a thematic layer representing inclined roads and their buffer zones. The 'distance from roads' layer has already been used in many analyses (e.g. Youssef et al. 2016); however, this layer should be more precise for the analysis of urbanized areas.
Previous studies on landslide susceptibility mapping found that data quality and research depth are the most important aspects that have a principal effect on the accuracy of assessment results (Galli et al. 2008;Liu and Duan 2018). It is difficult to designate very small landslides in urban areas of Gdynia. Despite the considerable time devoted to landslide inventory in this study, the susceptibility analysis did not include several very small earthflows (up to 10-15 m long), for example, those which occurred along the railway embankment in 2016 (Fig. 7). Note that even small slope failures in this area halted rail traffic, disrupting the flow of goods, and proved expensive to restore.
Considering the 'land use and land cover' map applied in susceptibility assessment (Fig. 3), this layer was compiled based on the Urban Atlas (2012). Only one additional class 'coastal zone' was created on that map. This class is very important for the analysis. However, many of the relationships observed during field research (i.e. the presence of landslides near roads, railways, and embankments), have not been shown using statistical methods on the 'land use and land cover' map. This indicates the need for a separate land use map aligning with the occurrences of landslides (e.g. with buffer zones around roads and railways).
According to Longley et al. (2015) GIS tools do not exclude forecasting errors, although even approximate results are of great value in planning processes. Accuracy and the method of special data generalization used in analyses have influences on the quality of the model. The susceptibility analysis described in this work indicates that failure to consider 'invisible' factors (e.g. the factor of landslide stabilizing engineering solutions) within the scale of the map (in this case 1:10,000) can sometimes lead to erroneous conclusions. Another 'invisible' factor is the spatiotemporal change in the shoreline caused by abrasion. These shortcomings can be overcome by supplementing the susceptibility map with additional information on stabilization and the rate of sea abrasion in historical times . Shoreline change analysis can also be used in susceptibility mapping (Montoya-Montes et al. 2012).
Landslide susceptibility assessment is a valuable tool for land planning and reducing the costs of landslides (Fell et al. 2008a, b). Susceptibility maps indicate the most sensitive places in Gdynia, which may be the basis for the use of preventive or protective measures. Furthermore, in zones of high landslide possibility, slope stability may be easily triggered by minor natural or artificial changes. Therefore, appropriate procedures should be developed (e.g. a prohibition on logging, restrictions on development density, regular checks, and maintenance of drainage systems). To facilitate the use of susceptibility maps, a landslide protocol is necessary to link the terrain domains outlined in the terrain zonations to specific actions (Rossi et al. 1982;Van Den Eeckhaut et al. 2009). Design of the appropriate landslide protocol for Gdynia should be prepared involving relevant stakeholders and considering the negative consequences of landslides.

Conclusions
Considering the ongoing demographic trends and potential climate effects, landslide susceptibility assessment in urban areas is vital for reducing future disaster risks. In this study, two techniques presenting landslides, 13 factors, and three widely used statistical methods (SLI, WOE, and LR) were applied to landslide susceptibility mapping of Gdynia, Poland. Consideration of different techniques of presenting landslides and the use of many thematic layers in the statistical approach allows for the selection of the most appropriate models and accurate determination of the role of instability factors in analysis.
Reliable results were obtained using points located on the main landslide scarps. The use of success curves demonstrated lower suitability of entire landslide areas for susceptibility analysis. All analysed contributing factors (slope angle, slope aspect, relief energy, elevation, vertical curvature, LS, TWI, SPI, distances from watercourses and reservoirs, groundwater table level, ground type at a depth of 1 m b.g.l., ground type at a depth of 4 m b.g.l., and land use and land cover) have some impact on the process of landslide formation, but their roles are different. Many of these passive factors are interrelated (e.g. slope angle and TWI) and some of them also relate to active factors, i.e. triggers (e.g. elevation and heavy rainfall).
Factors that contribute significantly to landslides in Gdynia are land use and land cover, and topographic attributes (particularly slope angle, TWI, relief energy, LS, and vertical curvature); geology and hydrogeological conditions play particularly important roles. The modelling results for Gdynia show that landslide susceptibility is the highest in areas with slopes > 14° and the danger of landslides rises as slope increases. Slopes with east-, northeast-, and north-oriented aspects, developed in sediments with lithological diversity in the vertical profile (cohesive and non-cohesive deposits) are equally vulnerable to mass wasting. The coexistence of unconsolidated material over cohesive deposits facilitates rainwater infiltration and the surface of rupture. In addition, the modelling results and field study reveal the significant roles of the embankment. A shallow groundwater table of 0-1 m often determines the surface of rupture. In the young morainic area of Gdynia, geomorphological factors such as river erosion, coastal abrasion, and sheet and rill erosion are crucial for the reactivation of mass movement.
The use of more advanced statistical methods, taking into account in the calculations the absence of landslides (WOE) and the interaction of factors (LR), allows to obtain better results than using the basic statistical method (SLI). The most remarkable result to emerge from the data is that a few independent passive factors (i.e. slope angle, slope aspect, and lithology) are sufficient to present a reliable landslide susceptibility map of Gdynia. Among the three statistical methods used in this study, the SLI method performed poorly in producing the susceptibility map despite its confirmed validity and ease of application. The SLI model prepared with many interrelated controlling factors appears to result in overpredicting. The WOE and LR methods are comparable in terms of their predictive capabilities; however, evaluation of their corresponding models revealed that the LR model exhibits the best landslide susceptibility prediction performance. These results offer compelling evidence for the need to check for independence in the selection of controlling factors and the use of an independent sub-dataset of landslides for validation.