1 Introduction

Landslide inventories are an essential element of landslide hazard assessments. They are the basis for calculating the spatial, temporal and magnitude probabilities of landsliding (Guzzetti et al. 2005). A landslide inventory is a systematic record of landslide activity in a specific area that details the locations and dates of landslides, as well as other attributes that vary from study to study, e.g., landslide length, width, lithological properties, type, stage of activity.

A substantial number of publications use landslide inventories to assess landslide susceptibility (e.g., Legorreta Paulín et al. 2016; Su et al. 2015; Mancini et al. 2010), to calculate the temporal probability of landslide occurrence (e.g., Corominas and Moya 2008; Guzzetti et al. 2005) or to characterize the influence of triggering events (e.g., large earthquakes and intense rainstorms) on landslide formation (e.g., Malamud et al. 2004; Gorum et al. 2011; Hovius et al. 2011). Such studies generally concentrate on the location and extent of landslides, dates of their (re-)activation and their area and/or volume. Geo-environmental attributes, when included, are typically assigned to mapping units that cover the entire study area rather than to individual landslides. However, the attributes of individual landslides are valuable in the analysis of landslide geometry (e.g., for runout modeling), in the determination of landslide types (Van Westen et al. 2008; Costanzo et al. 2012), as a preparatory step for susceptibility modeling, in the model interpretation and in the indirect evaluation of the inventory completeness (Trigila et al. 2010).

Landslide attributes are often determined through field-based investigations or by visual interpretation of remote sensing data (e.g., Longoni et al. 2014; Ghosh et al. 2011; Tang et al. 2016). In such cases, field observations and expert judgment are used to determine a relatively large number of attributes for a comparatively low number of landslides in an area of limited size. However, these techniques cannot readily be adopted for large regional inventories. Such inventories have become more widespread due to the broader availability of suitable satellite imagery, which allowed for the introduction of (semi-)automated methods for inventory generation (e.g., Martha et al. 2013; Mondini et al. 2013). In this study, we determine the attributes of individual landslides contained in a multi-temporal, multi-source inventory featuring over 2800 landslide polygons for a study area in southern Kyrgyzstan (Golovko et al. 2015) that covers more than 12,000 km\(^{2}\). A large portion of the inventory was automatically generated from multi-temporal optical satellite imagery (Behling et al. 2014b, 2016). Automated landslide detection makes it possible to efficiently update the inventory whenever new satellite images become available. Each time that the inventory is updated, it is necessary to determine the attributes of the newly included landslides.

For this purpose, efficient (semi-)automated methods of attribute derivation are needed. However, to our knowledge, a systematic methodology for attribute derivation has not yet been developed. In this study, we use preexisting and newly implemented GIS tools and remote sensing data to automate the process of attribute derivation and make it more objective than manual determination. Because the inventory was compiled from multiple sources of landslide information, we also compare the quality of the different sources and evaluate the effect of the data collection method on the distribution of landslide attribute values. Finally, we analyze the spatial and temporal differentiation of landslide properties. This is the first multi-temporal landslide inventory for the study area that represents landslides with polygons. In a substantial part of the inventory, the polygons reflect the spatial extent of a single slope failure at a particular point in time. Such representation is rare for inventories with several thousands of landslides, particularly for data-scarce regions such as southern Kyrgyzstan. This detailed multi-temporal landslide inventory enables landslide analyses that were not previously possible, such as a systematic multi-temporal study of the geometric parameters of the landslides. The developed methodology is especially suitable for improving the knowledge of landslides in areas with limited and heterogeneous spatial data availability. In principle, this methodology can be applied in any landslide-prone region of the world.

2 Study area

The study area is located in southern Kyrgyzstan at the eastern rim of the Fergana basin in the foothills of the Tian Shan (Fig. 1). The region experienced two major folding stages: the Hercynian and Neotectonic orogenies. The oldest and highest structures are composed of igneous and metamorphic rocks of the Proterozoic and Paleozoic. They encircle the intramountain Fergana basin. At elevations below 2200 m a.s.l., these consolidated cores are overlain by Jurassic–Oligocene formations, which consist of folded layers of clays, argillites, sandstones, limestones, marls, gypsum and conglomerates (Havenith et al. 2009). The folding decreases the stability of these structures, and the alternating permeable and impermeable layers in the folds create favorable conditions for water accumulation and the formation of sliding surfaces. Quaternary loesses of varying thickness cover these units. The core of the basin is occupied by younger unfolded structures (Wetzel et al. 2000). The lithological composition of the stratigraphic units is shown in Table 1.

Fig. 1 Study area in southern Kyrgyzstan and locations of the landslides contained in the multi-source inventory

Table 1 Lithological units in the study area (following Roessner et al. 2006)

Large rotational and translational slides in the areas of Jurassic–Oligocene folded sedimentary rocks have affected settlements and infrastructure the most. Landslides in massive Quaternary loess units are particularly dangerous due to their rapid avalanche-like movements. They combine rotational and dry flow movement with long runout zones and are most common in the northern part of the study area, where loess deposits reach their maximum thickness (Roessner et al. 2005). Some of these landslides are part of large deep-seated slope deformations (Bedoui et al. 2009; Lebourg et al. 2014; Zerathe and Lebourg 2012) that experience persistent and very slow movements in their core part, whereas the Quaternary loess cover on top is subjected to faster and discrete slope failures. The less abundant slope failures in basement units occur predominantly in the form of shallow landslides. Large rockfalls are present in the vicinity of faults. The Upper Neogene and Lower Quaternary sediments are rarely affected by landslides. They are composed of conglomerates and weakly consolidated gravels with interbedded loess-like loams. These deposits are more stable due to their high porosity and permeability (Wetzel et al. 2000).

The major faults in the study area are the Madin-Taldi-Suu, Aldiyar, South-Nookat, North- and South-Kuvin fault systems (Feld et al. 2015; Lemzin 2005). Areas where the major faults converge are particularly landslide-prone, e.g., the slope southeast of the town of Uzgen. The study area is subjected to high and ongoing tectonic activity expressed by frequent but relatively low-magnitude earthquakes (Haberland et al. 2011).

The study area receives more precipitation than the territories situated further east because the Tian Shan acts as an orographic barrier. Snow cover is typically established in the winter months with significant annual variations (Ibatulin 2011). Monthly precipitation levels reach their maximum in March with a second peak in November (Torgoev et al. 2010). The precipitation maximum in March often coincides with the onset of snowmelt, which further increases surface water infiltration into the ground. The resulting rise in groundwater levels is one of the major factors for the onset of deep-seated slope failures, which represent the main type of landslides in the study area.

The temporal relationships between the occurrence of landslides in the region and variations in the hydrometeorological and seismic conditions are complex. Well-defined single triggering events are not typical for the region. Rather, the onset of slope failures is caused by the cumulative effect of these factors over prolonged periods of time (Danneels et al. 2008) and their interaction with predisposing factors, such as the lithological and tectonic structures. Most landslides occur in the months from March to May as a result of snowmelt and increased precipitation during this period (Torgoev et al. 2010). Significant variations in the process intensity over the years have been observed, which have been linked to differences in the amount of winter precipitation and the specific characteristics of snow accumulation (Ibatulin 2011; Torgoev et al. 2010).

The interrelations between these factors and the mechanisms of landslide initiation in the region have not yet been sufficiently determined. Therefore, precise spatio-temporal information on landslide occurrence is important for a better understanding of the complex mechanisms for the onset of slope failures in this region.

3 Landslide inventory maps

In Kyrgyzstan, slope failures result in numerous casualties and economic losses. From 1973 to 2010, landslides in southern Kyrgyzstan caused 211 fatalities (Ibatulin 2011). The local Ministry for Emergency Situations (MES) has been conducting landslide investigations in this region since the 1950s. These investigations include repeated field surveys of known landslide-prone slopes and landslide cataloging efforts, e.g., Yerokhin (1999) and Ibatulin (2011), and in recent years, field landslide mapping in selected districts by the Central-Asian Institute for Applied Geosciences (CAIAG). These records provide valuable insights into the properties of landsliding in the region. However, they represent an approach that relies on field investigations and expert knowledge. Moreover, the investigations of the MES have been conducted for the purpose of disaster mitigation and have thus been limited to parts of the study area with a risk to the population and infrastructure. Furthermore, no regular landslide monitoring has been conducted in the region since 1991 due to the lack of funding following the collapse of the former Soviet Union; consequently, the completeness of the landslide record varies considerably for different periods. Objective and comprehensive landslide mapping with uniform coverage of the entire study area has not yet been performed. To address these issues, we have compiled a more comprehensive inventory that combines archive information, field mapping and the identification of landslides in satellite images. The inventory consists of four components, which are presented in Table 2.

Table 2 Landslide data sources

The first component is based on the report by Ibatulin (2011), which verbally describes a small number of particularly large, destructive or otherwise remarkable landslides in the study area. The report is a result of 40-year-long landslide investigations by the MES. It specifies the dates of most recorded landslide events as precisely as the day of the failure. Additional effort was necessary to transfer these textual descriptions into spatial data in the form of polygons. A set of high-resolution satellite imagery (Behling et al. 2014b), topographic maps and field visits were useful in this process. Often, the spatial extent of the landslides could only be determined approximately, particularly if a long time had passed since their failure. Despite the limited spatial accuracy, this data source provides the most precise information on the landslide failure dates.

The second inventory component, field mapping by GFZ, represents the result of regular visits to the study area by scientists from Section 1.4 of the GFZ German Research Centre for Geosciences since 1998. Most landslide polygons in this data source represent a cumulative effect of multiple slope failures without a differentiation of their failure dates and may include very old slope failures. The field mapping results are complementary to the previous inventory component, i.e., landslides were not included in the field mapping dataset if they were already mapped after Ibatulin (2011). These two sources of landslide data together will henceforth be referred to as ‘field-based sources’ in this paper.

To obtain the third and fourth inventory components, an automated approach to the detection of landslides from time series of satellite images was developed by Behling et al. (2014a, b). A co-registered database of optical satellite imagery was created with a total of 729 images acquired in 1986–2013 by various satellites (Landsat, SPOT, ASTER and RapidEye) with a spatial resolution ranging from 30 m for Landsat to 5 m for RapidEye (Behling et al. 2014b). The temporal resolution varies from 6 years at the beginning to several weeks at the end of the covered period; at least yearly coverage has been available since 1996, with a gap in 2006. A multi-temporal automated approach was applied to these images, which compares pixel NDVI values over time and detects typical landslide NDVI trajectories with an abrupt vegetation decrease at the time of the failure and slow revegetation afterward (Behling et al. 2014a, 2016). Whereas bi-temporal change detection enables the creation of event-based or seasonal inventories, the multi-temporal automated detection approach enables the creation of a multi-temporal inventory, which is more suitable for our study area. If multiple landslide events occur within the same slope and their bodies overlap, their extent can still be correctly determined if an image is available after each of these events.
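The idea of a landslide-like NDVI trajectory can be illustrated with a minimal sketch. It is not the detection algorithm of Behling et al.; it only assumes that a failure appears as an abrupt NDVI drop between two consecutive acquisitions, after which the values stay clearly below the pre-failure level for several images while slowly increasing. The function names and all thresholds (`min_drop`, `recovery_margin`, `n_after`) are hypothetical illustration values.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index from NIR and red reflectance."""
    return (nir - red) / (nir + red)


def find_failure_interval(series, min_drop=0.3, recovery_margin=0.1, n_after=3):
    """Return (date_before, date_after) bracketing a landslide-like NDVI drop.

    series: list of (date, ndvi) tuples sorted by date.
    All thresholds are hypothetical illustration values.
    """
    for i in range(1, len(series)):
        before, after = series[i - 1][1], series[i][1]
        if before - after < min_drop:
            continue  # no abrupt vegetation loss at this step
        later = [v for _, v in series[i + 1:i + 1 + n_after]]
        if len(later) < n_after:
            continue  # too few later images to judge revegetation
        # Slow revegetation: still clearly below the pre-failure level ...
        still_bare = all(v < before - recovery_margin for v in later)
        # ... but not lower than directly after the failure.
        regrowing = later[-1] >= after
        if still_bare and regrowing:
            return series[i - 1][0], series[i][0]
    return None


# Hypothetical pixel trajectories (year, NDVI)
landslide_pixel = [(2009, 0.62), (2010, 0.60), (2011, 0.15),
                   (2012, 0.20), (2013, 0.27), (2014, 0.33)]
fast_recovery_pixel = [(2009, 0.60), (2010, 0.60), (2011, 0.15),
                       (2012, 0.58), (2013, 0.60), (2014, 0.61)]

failure_interval = find_failure_interval(landslide_pixel)
no_failure = find_failure_interval(fast_recovery_pixel)
```

In this sketch, the returned pair of dates brackets the period of the failure, which is why a denser image time series narrows the dating of each detected landslide. The second trajectory recovers within one acquisition and is therefore not flagged.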

This approach allows a high degree of spatial and temporal precision and completeness to be achieved and ensures an objective mapping method with equal consideration of the complete study area. The automated landslide detection has been performed (1) for the entire study area with high-resolution RapidEye images for the period of 2009–2013, resulting in 624 landslide polygons (‘short-term automated landslide detection’), and (2) for a subset of the study area using imagery from all sensors for the period of 1986–2013 with 1583 detected slope failures (‘long-term automated landslide detection’). The RapidEye images used for short-term detection that covered this subset were also included in the long-term detection (see Fig. 1).

The first (based on Ibatulin 2011) and second (field mapping) inventory components reflect the conventional approach to landslide mapping based on expert knowledge. Because expert investigations are hardly possible after every single landslide event, at least some of the resulting landslide polygons may not coincide with the spatial extent of single events (Marc and Hovius 2015). This problem is aggravated if the study area is large and/or difficult to access or if few resources are available for landslide mapping, as is the case in southern Kyrgyzstan. New automated approaches that use multi-temporal satellite imagery can delineate individual failures more accurately. The correspondence between the representation in the inventory and the actual spatial extent of landslide events is not always considered when landslide attributes are determined. The availability of data from different sources in this study makes it possible to compare the influence of the data acquisition method on the attribute values.

Fig. 2 Landslides from various data sources for a subset of the study area in the Budalyk river valley and an example of a slope failure

Figure 2 shows a subset of the multi-source inventory for an area southwest of the village of Gulcha around the Budalyk river valley. Of all inventory components, the results of field mapping are characterized by particularly large areas affected by landsliding, but the dates of the failures are not specified. Conversely, the results of automated detection contain polygons of smaller sizes because they are able to delineate the spatial extent of a single slope failure and the period of its occurrence. Landslides digitized after Ibatulin (2011) are the least numerous, but their failure dates are documented with the highest temporal precision. These data sources supplement each other due to their different levels of spatial and temporal detail. The resulting multi-source inventory is the most extensive landslide record available for the study area in southern Kyrgyzstan. The representation of landslides in the form of polygons is important both for the more precise spatial localization of landslide objects and for conducting types of analysis that are not possible using point-based data.

4 Methods

Depending on the properties of the study area and research objectives, the list of attributes in a landslide inventory can vary (Wieczorek 1984; Cruden and Varnes 1996; Van Den Eeckhaut and Hervás 2012; Xu 2015; Schlögel et al. 2015; Pellicani and Spilotro 2015). Attribute determination often relies on field visits and expert knowledge, which is not feasible for an inventory with over 2000 landslides. Instead, we use an automated GIS-based approach to derive selected attributes that characterize landslide geometry and geo-environmental properties.

Our approach combines standard and customized (newly implemented) GIS functionality described in the following subsections. Standard GIS procedures were performed using the tools of the proprietary ArcGIS 10.0 and open-source QGIS 2.8 software packages. The customized functionality required for some of the attributes was implemented in the Python programming language as a QGIS plug-in (QGIS Development Team 2015).

For relief-related analyses, we use the freely available digital elevation data in the form of ASTER GDEM Version 2 (2011) with 30 m resolution. Due to the predominantly treeless character of the vegetation in the study area, this digital surface model can be used to represent the Earth’s topography. A more detailed DEM would be desirable. However, because we aim at an automated approach that can efficiently calculate attributes for a large number of landslides in an inventory, we accept the resulting imprecision for smaller landslides. Furthermore, the proposed analysis can be conducted again with little additional effort as soon as more detailed digital elevation data become available for the study area, or it can be applied to a less data-scarce region where such a DEM is already available.

Because a lithological map at a sufficient spatial scale (1:250,000) is not available, we use the geological map instead; the geological formations contained in this map are characterized by specific lithological composition and behavior (see Table 1 for more details). The geological map is the result of a combination and expert reinterpretation of several scanned and geo-referenced 1:200,000 paper maps, which were published in the Soviet Union prior to 1991. The representation of Quaternary units in this map is incomplete, which does not allow loesses (which are very important for landsliding) to be distinguished from other Quaternary deposits.

4.1 Geometrical landslide properties

Area and perimeter are common landslide attributes. They can easily be determined using the standard functionality of any GIS software. By contrast, the calculation of the length and width of a landslide polygon is more complex, despite the apparently simple concept of length and width. In the following subsections, we present our customized method for automated landslide length and width calculations and demonstrate how the length and width information can be further used to calculate the landslide compactness as an indirect indicator of the landslide movement type.

4.1.1 Landslide length and width

One of the methods for calculating the length and width of a complex polygon is to derive an enclosing ellipse and use the two axes of the ellipse as the polygon length and width (Burger and Burge 2009). For landslide-related applications, this method may result in errors, particularly for rotational slides and small failures, when the width of a landslide is greater than or comparable to its length. Our approach is based on the fundamental property of gravitational mass movements to move down the slope. We combine landslide polygons with raster digital elevation data to derive the landslide length. This calculation method can be executed automatically for a large number of landslides.

Fig. 3 Approach to landslide length and width calculation: a landslide polygon with its highest and lowest points, b length line connecting the highest and lowest points, its perpendicular bisector and the new point in the middle of the intersection of the perpendicular bisector with the landslide polygon, c, d the same procedure is performed with each new segment of the length line that is not completely inside the landslide polygon, e final length line, and f width lines calculated using perpendicular bisectors of the length line segments

In the first stage, the highest and lowest points are calculated for each landslide polygon (Fig. 3a). This is achieved using a combination of vector landslide data with raster digital elevation data. The DEM raster is clipped to the landslide polygon in question, and the center of the pixel with the highest or lowest elevation value is calculated.
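The first stage can be sketched in plain Python as follows. This is a minimal illustration rather than the plug-in code: it assumes a north-up DEM stored as a list of rows, an upper-left raster origin, square cells, and a landslide polygon given as a list of vertices. Instead of clipping the raster, it tests each pixel centre against the polygon. The names `point_in_polygon` and `highest_lowest_points` are hypothetical.

```python
def point_in_polygon(x, y, ring):
    """Even-odd ray casting; ring is a list of (x, y) vertices."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def highest_lowest_points(dem, origin_x, origin_y, cell, ring):
    """Return the (x, y, z) pixel centres with the highest and lowest elevation.

    dem: list of rows, row 0 being the northernmost row.
    origin_x, origin_y: coordinates of the upper-left raster corner.
    cell: pixel size in map units (30 for ASTER GDEM in metres).
    """
    inside_cells = []
    for r, row in enumerate(dem):
        for c, z in enumerate(row):
            cx = origin_x + (c + 0.5) * cell
            cy = origin_y - (r + 0.5) * cell
            if point_in_polygon(cx, cy, ring):
                inside_cells.append((cx, cy, z))
    if not inside_cells:
        raise ValueError("no pixel centre falls inside the polygon")
    highest = max(inside_cells, key=lambda p: p[2])
    lowest = min(inside_cells, key=lambda p: p[2])
    return highest, lowest


# Hypothetical 4 x 4 DEM with 30 m cells and a square landslide polygon
dem = [[500, 510, 520, 530],
       [490, 500, 510, 520],
       [480, 490, 500, 510],
       [470, 480, 490, 500]]
ring = [(20, 20), (100, 20), (100, 100), (20, 100)]
highest, lowest = highest_lowest_points(dem, 0.0, 120.0, 30.0, ring)
```

Landslides smaller than one pixel may contain no pixel centre at all. The sketch raises an error in that case, which is one situation where manual treatment would be needed.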

In the second stage, a line (‘length line’) is drawn between these two points. If this line is completely inside the landslide polygon, then the algorithm stops, and the length of the line is considered to be the landslide length. Otherwise, new points need to be added recursively to the length line. In each recursive step, a perpendicular bisector of the length line is constructed (Fig. 3b). A new point is added to the length line at the middle of the intersection of the perpendicular bisector with the landslide polygon. Each of the two new segments of the length line is then subjected to the same procedure, and new points are added until the complete length line is inside the landslide polygon (Fig. 3c–e).

The segments that result from the intersections of the perpendicular bisectors with the landslide polygon are used to determine the width of the landslide polygon at multiple positions (Fig. 3f). The mean, median, maximum, minimum and standard deviation values of all width segments associated with the same landslide are then recorded in the attribute table of the landslide shapefile to characterize the width of each landslide polygon as a whole.
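The recursive construction of the length line and the derivation of the width statistics can be sketched as follows. The sketch makes several assumptions that are not specified in the text: when a perpendicular bisector intersects the polygon more than once, the chord nearest to the segment midpoint is used; widths are measured on the bisectors of the final length line segments (as suggested by Fig. 3f); the recursion depth is capped; and the standard deviation is the population value. All function names are hypothetical, and degenerate cases (a bisector passing exactly through a vertex) are not handled robustly.

```python
import math
import statistics


def cross(ux, uy, vx, vy):
    return ux * vy - uy * vx


def edges(ring):
    return zip(ring, ring[1:] + ring[:1])


def point_in_polygon(pt, ring):
    """Even-odd ray casting; ring is a list of (x, y) vertices."""
    x, y = pt
    inside = False
    for (x1, y1), (x2, y2) in edges(ring):
        if (y1 > y) != (y2 > y) and x < x1 + (y - y1) * (x2 - x1) / (y2 - y1):
            inside = not inside
    return inside


def segment_inside(p, q, ring):
    """True if pq properly crosses no polygon edge and its midpoint is inside."""
    for a, b in edges(ring):
        d1 = cross(q[0] - p[0], q[1] - p[1], a[0] - p[0], a[1] - p[1])
        d2 = cross(q[0] - p[0], q[1] - p[1], b[0] - p[0], b[1] - p[1])
        d3 = cross(b[0] - a[0], b[1] - a[1], p[0] - a[0], p[1] - a[1])
        d4 = cross(b[0] - a[0], b[1] - a[1], q[0] - a[0], q[1] - a[1])
        if d1 * d2 < 0 and d3 * d4 < 0:
            return False
    return point_in_polygon(((p[0] + q[0]) / 2, (p[1] + q[1]) / 2), ring)


def bisector_chord(p, q, ring):
    """Chord cut from the polygon by the perpendicular bisector of pq.

    Assumption of this sketch: if there are several chords, the one nearest
    to the midpoint of pq is returned.
    """
    mx, my = (p[0] + q[0]) / 2, (p[1] + q[1]) / 2
    dx, dy = -(q[1] - p[1]), q[0] - p[0]  # direction perpendicular to pq
    ts = []
    for (ax, ay), (bx, by) in edges(ring):
        ex, ey = bx - ax, by - ay
        denom = cross(dx, dy, ex, ey)
        if denom == 0:
            continue  # edge parallel to the bisector
        s = cross(ax - mx, ay - my, dx, dy) / denom  # position along the edge
        if 0 <= s < 1:
            ts.append(cross(ax - mx, ay - my, ex, ey) / denom)
    ts.sort()
    chords = [(ts[i], ts[i + 1]) for i in range(0, len(ts) - 1, 2)]
    if not chords:
        return None

    def offset(c):
        return 0 if c[0] <= 0 <= c[1] else min(abs(c[0]), abs(c[1]))

    t0, t1 = min(chords, key=offset)
    return (mx + t0 * dx, my + t0 * dy), (mx + t1 * dx, my + t1 * dy)


def length_line(p, q, ring, max_depth=8):
    """Bend the line pq recursively until every segment lies inside the polygon."""
    if max_depth == 0 or segment_inside(p, q, ring):
        return [p, q]
    chord = bisector_chord(p, q, ring)
    if chord is None:
        return [p, q]
    (ax, ay), (bx, by) = chord
    new = ((ax + bx) / 2, (ay + by) / 2)  # middle of the chord
    first = length_line(p, new, ring, max_depth - 1)
    second = length_line(new, q, ring, max_depth - 1)
    return first + second[1:]


def length_and_widths(high, low, ring):
    line = length_line(high, low, ring)
    length = sum(math.dist(a, b) for a, b in zip(line, line[1:]))
    chords = [bisector_chord(a, b, ring) for a, b in zip(line, line[1:])]
    widths = [math.dist(*c) for c in chords if c is not None]
    return line, length, widths


# Hypothetical L-shaped polygon with its highest and lowest points
ring = [(0, 0), (40, 0), (40, 10), (10, 10), (10, 40), (0, 40)]
line, length, widths = length_and_widths((5.0, 35.0), (30.0, 5.0), ring)
width_stats = {"mean": statistics.mean(widths), "median": statistics.median(widths),
               "max": max(widths), "min": min(widths),
               "std": statistics.pstdev(widths)}
```

For the L-shaped example, the straight line between the two points leaves the polygon, so one point is added on the perpendicular bisector and the length line follows the bend of the polygon. For a convex polygon, the function returns the straight line unchanged.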

4.1.2 Compactness

Compactness is a measure of how wide or narrow a landslide polygon is relative to its length. It also offers a way to indirectly describe the type of landslide movement. The movement type is often included as an attribute in landslide inventories. The most reliable approaches for determining the movement type are field studies and expert judgment. However, the direct determination of the movement type for landslides detected using remote sensing is difficult. Furthermore, the large size of the inventory and its updatable character make expert evaluation a labor-intensive task, which is why an indirect inference on the landslide movement type from the landslide shape using the measure of compactness can be useful. In addition, the compactness value can point to the relationship between the landslide area, which is easy to derive in a GIS, and the landslide volume because more compact landslides tend to have larger volumes for the same area (Havenith et al. 2015).

A common approach for calculating the compactness of a polygon is the isoperimetric quotient Q (Osserman 1978), also referred to as elongation factor (Havenith et al. 2015):

$$\begin{aligned} Q = \frac{4 \pi A}{P^2}, \end{aligned} \tag{1}$$

where A is the polygon area and P is the polygon perimeter. The quotient can have values in the range (0, 1], with a value of 1 corresponding to the most compact shape, a circle, and lower values indicating less compact polygons.
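As a worked illustration of Eq. (1), the following sketch computes Q for three hypothetical polygons using the shoelace formula for the area. The function names and test shapes are illustrative assumptions and are not part of the inventory processing.

```python
import math


def shoelace_area(ring):
    """Area of a simple polygon given as a list of (x, y) vertices."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2


def perimeter(ring):
    return sum(math.dist(a, b) for a, b in zip(ring, ring[1:] + ring[:1]))


def isoperimetric_quotient(ring):
    """Q = 4 * pi * A / P**2, Eq. (1)."""
    return 4 * math.pi * shoelace_area(ring) / perimeter(ring) ** 2


square = [(0, 0), (10, 0), (10, 10), (0, 10)]
strip = [(0, 0), (100, 0), (100, 10), (0, 10)]
# Regular 360-gon as an approximation of a circle
circle = [(math.cos(math.radians(k)), math.sin(math.radians(k))) for k in range(360)]

q_square = isoperimetric_quotient(square)   # pi / 4
q_strip = isoperimetric_quotient(strip)     # elongated shape, lower Q
q_circle = isoperimetric_quotient(circle)   # close to the upper bound of 1
```

The square yields Q = π/4, the elongated 100 × 10 strip yields a markedly lower value, and the near-circular polygon approaches the upper bound of 1.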

However, using the isoperimetric quotient is problematic in the case of the multi-source inventory for southern Kyrgyzstan. This is because the landslides that were automatically detected from remote sensing imagery preserve their pixel-based zigzag boundaries, leading to substantially higher perimeter values that are incomparable to other sources of landslide data. Therefore, we have used the landslide length rather than the perimeter to calculate compactness:

$$\begin{aligned} C = \frac{A}{L^2}, \end{aligned} \tag{2}$$

where A is the polygon area and L is the polygon length. In the general case, when the length is the most extended dimension of a polygon, the range of this function is \((0,\,0.25\pi ]\), with the maximum attained by a circle whose diameter equals L. In the present application, however, the landslide length is not necessarily the largest dimension of the landslide polygon; thus, the compactness has no upper bound, and its range is \((0,\,\infty )\). Similar to the isoperimetric quotient, greater values indicate more compact landslides.
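The different sensitivity of Q and C to pixel-based boundaries can be illustrated with a hypothetical example: a right triangle with a smooth hypotenuse and the same triangle whose hypotenuse is replaced by a staircase of 1 m pixels. The shapes, the pixel size and the assumed length L = 100 m are illustrative assumptions only.

```python
import math


def shoelace_area(ring):
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2


def perimeter(ring):
    return sum(math.dist(a, b) for a, b in zip(ring, ring[1:] + ring[:1]))


def isoperimetric_quotient(area, perim):
    return 4 * math.pi * area / perim ** 2  # Eq. (1)


def compactness(area, length):
    return area / length ** 2  # Eq. (2)


def staircase_triangle(size, pixel):
    """Right triangle whose hypotenuse follows pixel edges (zigzag boundary)."""
    ring = [(0, 0), (size, 0)]
    for k in range(size // pixel):
        ring.append((size - k * pixel, (k + 1) * pixel))
        ring.append((size - (k + 1) * pixel, (k + 1) * pixel))
    return ring


smooth = [(0, 0), (100, 0), (0, 100)]
stair = staircase_triangle(100, 1)
length = 100.0  # assumed downslope length, identical for both outlines

a_smooth, p_smooth = shoelace_area(smooth), perimeter(smooth)
a_stair, p_stair = shoelace_area(stair), perimeter(stair)

q_smooth = isoperimetric_quotient(a_smooth, p_smooth)
q_stair = isoperimetric_quotient(a_stair, p_stair)
c_smooth = compactness(a_smooth, length)
c_stair = compactness(a_stair, length)
```

The staircase changes the area by only about 1 %, but its perimeter stays at 400 m regardless of the pixel size, compared with approximately 341 m for the smooth outline. Consequently, Q drops considerably for the pixel-based outline, whereas C remains nearly unchanged.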

4.2 Geo-environmental landslide properties

Information on geo-environmental landslide properties is typically available in the form of vector or raster maps that cover territory beyond the extent of a single landslide. Therefore, the challenge in the process of determining geo-environmental landslide attributes for a polygon-based inventory is the transition from the terrain properties to the properties of a single landslide. A landslide inventory can encompass a variety of geo-environmental attributes. In this study, we derive four attributes of this group: slope, aspect, geology and distance to stream.

Landslide slope values are calculated as the mean of the slope values of all pixels that are within a landslide polygon. However, this approach is inapplicable for aspect. Aspect is an example of data on a circular scale. Thus, the addition and division of multiple aspect values ranging from \(0^{\circ }\) to \(360^{\circ }\) to find their mean is unreasonable. Rather, we classify the aspect raster into eight classes for each cardinal and intercardinal direction. Then, we use the ‘majority’ option of the zonal statistics function in ArcGIS, which adopts the value of the most frequent class as the aspect attribute of the landslide.
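The problem with averaging aspect values, and the majority-based alternative, can be illustrated with a small stand-in for the zonal statistics step. The sketch below is not the ArcGIS tool; it assumes eight 45° classes centred on the cardinal and intercardinal directions (north spanning 337.5°–22.5°) and ignores flat cells. The function names are hypothetical.

```python
from collections import Counter

CLASSES = ["N", "NE", "E", "SE", "S", "SW", "W", "NW"]


def aspect_class(degrees):
    """Map an aspect in degrees (clockwise from north) to one of eight classes."""
    return CLASSES[int(((degrees % 360) + 22.5) // 45) % 8]


def dominant_aspect(aspects):
    """Most frequent aspect class among the pixels of one landslide."""
    counts = Counter(aspect_class(a) for a in aspects)
    return counts.most_common(1)[0][0]


# Hypothetical pixel aspects of a north-facing landslide
pixel_aspects = [350, 355, 5, 10, 20, 90]

arithmetic_mean = sum(pixel_aspects) / len(pixel_aspects)   # about 138 degrees
mean_class = aspect_class(arithmetic_mean)                  # misleading result
majority_class = dominant_aspect(pixel_aspects)
```

Five of the six pixels face north, yet the arithmetic mean of about 138° falls into the southeastern class. The majority rule returns the northern class. In the case of a tie, this sketch returns the class encountered first, which may differ from the behavior of a particular GIS tool.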

In this study, geology is available in the form of a vector map. A landslide polygon can intersect several geology classes; in fact, the succession of stratigraphic units along the slope is common for many slopes. Therefore, one is confronted with several potential geological classes for each landslide. A straightforward solution is to select the class that occupies the largest area within the landslide polygon. However, this approach can result in errors when the largest class is not representative of the conditions that led to the slope failure. Instead, we suggest using the landslide highest point (see Sect. 4.1.1) because the geological conditions at the landslide main scarp are crucial for the initiation of most landslides in the study area. This approach leads to a better representation of more rare classes, which might be absorbed into larger units when the maximum-area method is used.
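The difference between the two assignment rules can be shown with a simplified sketch. It assumes, for brevity, that the geology has been sampled at the DEM pixel centres inside a landslide, so that each pixel carries an elevation and a geology class; the study itself overlays the highest point on a vector map. The class names and values are hypothetical.

```python
from collections import Counter


def geology_by_max_area(cells):
    """Class covering most pixels of the landslide (maximum-area rule)."""
    counts = Counter(geology for _, geology in cells)
    return counts.most_common(1)[0][0]


def geology_at_highest_point(cells):
    """Class found at the pixel with the highest elevation (main scarp)."""
    _, geology = max(cells, key=lambda cell: cell[0])
    return geology


# Hypothetical landslide: (elevation in m, geology class) per pixel
cells = [
    (1510, "Cretaceous"),
    (1500, "Cretaceous"),
    (1480, "Paleogene"),
    (1470, "Paleogene"),
    (1460, "Paleogene"),
    (1450, "Paleogene"),
]

by_area = geology_by_max_area(cells)
by_scarp = geology_at_highest_point(cells)
```

In this example, the landslide body mostly covers the lower unit, so the maximum-area rule assigns that unit, whereas the highest-point rule assigns the unit at the main scarp, where the failure is assumed to have initiated.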

To calculate the distance to stream for each landslide, we derived the stream network from the DEM using the standard GIS routine that involves the derivation of flow direction and flow accumulation. A 1:100,000 topographic map was used to correctly set the threshold on the flow accumulation raster and to differentiate stream pixels from non-stream pixels. Subsequently, the Euclidean distance between the landslide polygon and the stream was calculated using the ArcGIS tool ‘Near.’ For each landslide, this tool identifies two closest segments, one from the landslide polygon and one from the stream dataset, and calculates the shortest distance between them (ArcGIS Help 2016).
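The distance calculation can be approximated by the minimum distance between the segments of the landslide boundary and the segments of the stream line. The following sketch is a plain-Python stand-in and not the ArcGIS implementation; it assumes planar coordinates and measures from the polygon boundary, so a stream lying entirely inside a polygon would not yield zero. The function names and geometries are hypothetical.

```python
import math


def cross(ux, uy, vx, vy):
    return ux * vy - uy * vx


def point_segment_distance(p, a, b):
    """Shortest distance from point p to segment ab."""
    ex, ey = b[0] - a[0], b[1] - a[1]
    len2 = ex * ex + ey * ey
    if len2 == 0:
        return math.dist(p, a)
    t = ((p[0] - a[0]) * ex + (p[1] - a[1]) * ey) / len2
    t = max(0.0, min(1.0, t))  # clamp the projection to the segment
    return math.dist(p, (a[0] + t * ex, a[1] + t * ey))


def segments_cross(p, q, a, b):
    """True if segments pq and ab properly intersect."""
    d1 = cross(q[0] - p[0], q[1] - p[1], a[0] - p[0], a[1] - p[1])
    d2 = cross(q[0] - p[0], q[1] - p[1], b[0] - p[0], b[1] - p[1])
    d3 = cross(b[0] - a[0], b[1] - a[1], p[0] - a[0], p[1] - a[1])
    d4 = cross(b[0] - a[0], b[1] - a[1], q[0] - a[0], q[1] - a[1])
    return d1 * d2 < 0 and d3 * d4 < 0


def segment_distance(p, q, a, b):
    """Shortest distance between segments pq and ab."""
    if segments_cross(p, q, a, b):
        return 0.0
    # For non-crossing segments, the minimum is attained at an endpoint.
    return min(point_segment_distance(p, a, b), point_segment_distance(q, a, b),
               point_segment_distance(a, p, q), point_segment_distance(b, p, q))


def polygon_stream_distance(ring, stream):
    """Minimum distance between a polygon boundary and a stream polyline."""
    boundary = list(zip(ring, ring[1:] + ring[:1]))
    reaches = list(zip(stream, stream[1:]))
    return min(segment_distance(p, q, a, b)
               for p, q in boundary for a, b in reaches)


# Hypothetical landslide polygon and stream lines
ring = [(0, 0), (10, 0), (10, 10), (0, 10)]
nearby_stream = [(20, -5), (15, 5), (25, 20)]
crossing_stream = [(-5, 5), (15, 5)]

d_nearby = polygon_stream_distance(ring, nearby_stream)
d_crossing = polygon_stream_distance(ring, crossing_stream)
```

The brute-force comparison of all segment pairs is adequate for a sketch. For thousands of landslides and a dense stream network, a spatial index would be needed to keep the computation efficient.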

5 Results

The attributes area, length, compactness, mean slope, dominant aspect, geology and distance to streams were successfully derived for the multi-source inventory. The length lines and values were manually corrected for some landslides. Most of them were very small or had complex geometries.

The attribute distributions were analyzed separately for each landslide data source to investigate the effect of the mapping method on the properties of the landslide inventory (Sect. 5.1). Furthermore, the degree of completeness was compared between the inventory components. In the second step, the spatial and temporal differentiation of the attributes was examined (Sects. 5.2, 5.3).

5.1 Variations of landslide properties by data source

The landslide properties differ depending on the source of the landslide data. Figure 4 illustrates the differences in landslide sizes (areas). In our study, the automated detection results have the highest frequency on the logarithmic scale in the range of \(10^{3}\)–\(10^{4}\,\hbox {m}^{2}\), whereas the two field-based data sources already exhibit a decrease in this size range. Thus, smaller landslides are underrepresented in the field-based sources. The histograms for both automatically derived datasets are similar, but smaller landslides are better represented by the automated detection results for 2009–2013. This is because the years 2009–2013 are characterized by relatively low landslide activity compared to the entire time period of the long-term analysis (1986–2013). The specific shape of the histogram for the long-term results with a peak around \(10^{4}\,\hbox {m}^{2}\) is due to the lower resolution of satellite images in earlier years, which did not permit such an extensive mapping of small landslides as was possible in the more recent periods. Landslides after Ibatulin (2011) have a narrow distribution around larger sizes. Landslides mapped during the field campaigns include a wider range of sizes.

Fig. 4 Distribution of landslide area values (decimal logarithm of landslide area) by landslide data source. a Data after Ibatulin (2011), b field mapping, c automated detection 2009–2013, d automated detection 1986–2013

With this in mind, the distribution of further landslide attributes (Fig. 5) can be analyzed. The larger landslide complexes documented in the field-based data sources are characterized by greater lengths, gentler slopes and toes located closer to streams. These tendencies are even stronger for the landslides mapped after Ibatulin (2011) than for the field mapping results. The distribution of compactness values does not exhibit significant differences between the data sources, which indicates the independence of the compactness calculation from the landslide mapping approach.

Fig. 5 Distribution of landslide attribute values by landslide data source. a Length, b compactness, c slope, d distance to stream

The distribution of landslides by geology class is similar for all landslide data sources. It is characterized by the prevalence of landslides within Cretaceous and Paleogene units (Fig. 6). Basement units and the unfolded Neogene platform sediments are less favorable for the development of landslides in the study area. A relative increase of landslides in the basement units is observed for the results of automated landslide detection in 2009–2013, which is due to the lower landslide activity in this period. This decrease in landslide activity was more significant for areas composed of weakly consolidated sediments than for the metamorphic basement rocks. Furthermore, many landslide locations had not been mapped prior to the availability of the automated detection method, and this especially applies to landslides outside of the major landslide hotspots.

Fig. 6 Landslide occurrence related to geology class by landslide data source (for lithological characterization of the geology classes, see Table 1). Red columns (in the middle) show landslide number, blue columns (on the right) show landslide area, and black columns (on the left) show the distribution of geology classes in the study area or in the study area subset. a Data after Ibatulin (2011), b field mapping, c automated detection 2009–2013, d automated detection 1986–2013

Figure 7 shows the distribution of the landslides by aspect for each data source. For all the data sources, the landslide activity is higher on the northern, northwestern and northeastern slopes than on the southern, southwestern and southeastern slopes. The automated landslide detection results show the dominance of the northeastern aspect in the distribution of landslide areas. That is, large landslides particularly occur on northeastern slopes. This tendency is milder for landslides described in Ibatulin (2011), and the northwestern direction prevails for the field mapping results in terms of both number and area. An explanation for this phenomenon is that the northern parts of the study area, particularly the regions of Kara-Unkyur and Downstream Kurshab with thick loess deposits, have experienced strong landslide activity since the start of landslide observations by the local authorities and especially during the very active year 1994. Due to the orientation of the river valleys and the patterns of loess accumulation in these regions, northwestern slopes have experienced the most landslides. In 2009–2013, however, no substantial landslide activity occurred in the northern part of the study area. Instead, the most important landslide hotspot in the period of regular satellite observations after 1998 was the slope southeast of the town of Uzgen. Because no settlements are located in the runout zone of this very active landslide complex, it received little attention from the local authorities compared to the magnitude of the landslide activity it experienced.

Fig. 7 Distribution of aspect values for different landslide data sources. Red lines show landslide number, blue lines show landslide area, and black lines show the distribution of aspect classes in the study area or in the study area subset. a Data after Ibatulin (2011), b field mapping, c automated detection 2009–2013, d automated detection 1986–2013
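The comparison of aspect distributions by landslide number and by landslide area can be sketched as follows. The use of eight 45° sectors centered on the compass directions is an assumption, as are the function names and the example values. The sketch shows why the two curves can diverge: a few large landslides in one sector raise its area share without changing its number share much.

```python
from collections import defaultdict

SECTORS = ["N", "NE", "E", "SE", "S", "SW", "W", "NW"]

def aspect_class(aspect_deg):
    """Map an aspect (degrees clockwise from north) to one of eight 45-degree sectors."""
    return SECTORS[int(((aspect_deg % 360.0) + 22.5) // 45.0) % 8]

def aspect_shares(landslides):
    """Return per-sector shares by landslide number and by landslide area.

    `landslides` is a non-empty iterable of (aspect_deg, area_m2) pairs.
    """
    number = defaultdict(int)
    area = defaultdict(float)
    for aspect_deg, area_m2 in landslides:
        sector = aspect_class(aspect_deg)
        number[sector] += 1
        area[sector] += area_m2
    total_number = sum(number.values())
    total_area = sum(area.values())
    by_number = {s: number[s] / total_number for s in SECTORS}
    by_area = {s: area[s] / total_area for s in SECTORS}
    return by_number, by_area

# Hypothetical (aspect, area) pairs, for illustration only
example = [(10.0, 1000.0), (40.0, 3000.0), (320.0, 1000.0), (180.0, 5000.0)]
by_number, by_area = aspect_shares(example)
```

In this toy input, the southern sector holds a quarter of the landslides but half of the landslide area.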

5.2 Spatial variations of landslide properties

Thus far, the landslide data have been examined by information source with no further differentiation. However, the study area in southern Kyrgyzstan is not homogeneous. For the spatially differentiated analysis, the study area was subdivided into ten regions based on natural watershed boundaries (Table 3). The spatial differentiation of landslide properties in the study area was analyzed using the automated detection results for the period 2009–2013 (624 landslides) because this dataset covers the study area in the most systematic way, regardless of how remote individual parts of it may be.

Table 3 Distribution of landslides by region and data source

An association between the dominant geology and aspect classes can be traced over all the regions (Figs. 8, 9). The prevalence of landslide-prone geological structures on particular sides of the river valleys often explains the dominating aspect values. For example, in the Kara-Unkyur valley, most landslides are located on the left river bank in folded Cretaceous and Paleogene deposits (in contrast to the right river bank composed of more stable unfolded sediments), and therefore, the northwestern landslide aspect is prevalent. The same is true for Downstream Kugart with its northwestern, western and eastern orientations of river valley sides with Cretaceous and Paleogene sediments. In Upstream Kugart, the northern and southern aspects are the most prominent, which reflects the course of the river from east to west at lower elevations, whereas the westward slopes in the higher upper part are composed of less landslide-prone basement rocks. In Downstream Kugart, the northwestern, northern and northeastern aspects are dominant due to the southern orientation of the slopes with basement rocks. In Changet and Yassi, Kara-Darya, Upstream Kurshab and Batken, geology classes are distributed more or less evenly among the aspect classes; thus, the dominant aspect classes in the landslide data reflect the dominant orientations of the river valley sides. For the Tar and the Ak-Buura and Taldik regions, landslides exhibit a distribution of aspect classes similar to that of the region as a whole, with a slight tendency toward the north, which could be linked to conditions more favorable for landsliding, such as higher moisture and longer snow cover duration on north-facing slopes. The close association between aspect and geology values in these areas reveals that the structural setting is the dominant factor determining the spatial differences in landslide occurrence.

Fig. 8 Landslide occurrence related to geology class by region (for lithological characterization of the geology classes, see Table 1)

Fig. 9 Distribution of landslide aspect values by region

Some regions within the study area exhibit landslide characteristics that are comparable to those derived for the study area as a whole (Figs. 8, 9, 10). Such regions are located at lower elevations and include downstream parts of the basins of the principal rivers in the study area: Kara-Unkyur, Downstream Kugart and Downstream Kurshab. Landslides in the Kara-Darya and Tar regions differ from the study-area-wide averages only in their greater distance to streams due to the presence of landslide-prone slopes with high relative heights and reactivations in the upper slope parts.

Fig. 10 Distribution of landslide attribute values by region. a Area, b length, c compactness, d slope, e distance to stream

Regions located in the upstream parts of the river basins differ more due to their geological composition. The Batken area in the southwest is particularly different: Almost all landslides here develop in basement rocks and on steeper slopes than in other regions. The landslide processes here are dominated by shallow landsliding and debris flows. The large share of shallow landslides is reflected in their short length and compact shape. In fact, it is the inclusion of this region in the automated analysis that increases the share of landslides in basement units in the automated detection results for 2009–2013 (Fig. 6).

A feature of the Ak-Buura and Taldik regions is the abundance of landslides in Neogene rocks, which are not prone to landsliding in other parts of the study area. A possible explanation is the incomplete inclusion of Quaternary sediments in the geological map and the high tectonic activity, which leads to instabilities of slopes that are otherwise stable in other parts of the study area.

Upstream Kugart is characterized by a higher share (approximately one-third) of landslides in basement units. This region also exhibits particularly low distances to streams (Fig. 10e) because approximately 40% of the automatically detected landslides here are actually a result of the Kugart river and its tributaries undercutting the river banks. Furthermore, many landslides (at least a quarter) are reactivations within large old landslide complexes at a distance from the main scarp, which explains the tendency toward low slope values for landslides here.

The geological composition of the Upstream Kurshab region is similar to that of the study area as a whole. However, the region exhibits the highest median value of the landslide length and the lowest median value of compactness, which points to the prevalence of flows in this region. This may be a result of steeper slopes, which are a manifestation of high tectonic activity.

Thus, the substantial differences in the attribute value distributions observed in some regions are mostly a result of the presence of different types of gravitational mass movement processes in these parts of the study area. Regions where the distributions of the attribute values are similar to those for the complete study area are characterized by larger landslides and higher landslide density. Differences in the geological composition of the territory play the dominant role in the variability of landslide properties between the regions. The systematic consideration of all parts of the study area in the process of automated landslide detection reveals the large variety of gravitational mass movement processes in the study area, which had not been fully considered before in the frame of the field-based landslide data sources. This once again highlights the subjectivity of these data sources and their focus on the identification of landslides in already known landslide hotspots of certain prevailing landslide types.

5.3 Temporal variations of landslide properties

To analyze the change of landslide properties over time, the automated detection results for the period 1986–2013 (1583 landslides) were used. This analysis covers a \(75\times 50\) km large subset of the study area south of the Tar river (see Fig. 1). The landslides were divided into 11 periods based on the acquisition dates of the imagery used for the automated detection. This division (Table 4) was chosen because it can be applied to the complete subset of the study area while preserving most of the temporal detail for the landslide data. The boundaries of the 11 periods were chosen in summer or late spring such that the time of intensive landsliding during and after snowmelt fits completely within a single period.

Table 4 Landslide number and frequency by period

The breakdown of these 1583 landslides by time periods is shown in Table 4. Due to the varying length of the time periods, the landslide numbers and frequencies were divided by the length of each period for better comparability in columns D and F. Eight landslides were not assigned to any period because the time of their occurrence could only be determined with lower temporal resolution and thus corresponds to multiple time periods. The peaks of landslide activity were in the periods 2001–2003 and 2003–2004, when the yearly landslide numbers were 4–5 times higher than the long-term average. These observations are consistent with reports by local experts (Torgoev et al. 2010; Ibatulin 2011) that the winters of 2002–2003 and 2003–2004 were especially rich in snow. Furthermore, there was a less significant increase in landslide activity in 2009–2011. The periods prior to 1998 have few landslides, which contradicts the fact that the year 1994 witnessed an exceptional number of landslides in the study area (Ibatulin 2011). This inconsistency is due to the lower spatial and temporal resolutions of the satellite imagery available for the earlier years of the analysis, which led to incomplete landslide detection.
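The normalization by period length described above can be illustrated with a short sketch. The period boundaries are simplified to whole years, and the counts are hypothetical placeholders, not the values of Table 4. The function name is likewise an assumption.

```python
def annual_rates(periods):
    """Return landslides per year for each (start_year, end_year, count) period."""
    rates = {}
    for start, end, count in periods:
        length = end - start
        if length <= 0:
            raise ValueError(f"period {start}-{end} has non-positive length")
        rates[f"{start}-{end}"] = count / length
    return rates

# Hypothetical counts for four contiguous periods (NOT the values of Table 4)
periods = [(1998, 2001, 30), (2001, 2003, 300), (2003, 2004, 200), (2004, 2005, 20)]
rates = annual_rates(periods)

# Long-term average over the whole span covered by the periods
total_count = sum(count for _, _, count in periods)
total_years = periods[-1][1] - periods[0][0]
long_term_rate = total_count / total_years

for label, rate in rates.items():
    print(f"{label}: {rate:.0f} per year, {rate / long_term_rate:.1f} x long-term average")
```

Dividing by period length keeps a two-year period with many landslides from appearing more active than a one-year period with a higher yearly rate, which is the comparison that the ratio to the long-term average relies on.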

Despite the significant differences in the intensity of landslide activity between the periods, the landslide attributes vary less in time than they do spatially between different parts of the study area. The landslide geometry attributes (Fig. 11a–c) reflect the lower resolution of satellite imagery used in the automated landslide detection for the periods prior to 2001. These periods are characterized by a higher median area and length of the landslides than the periods after 2005. Large landslides follow the same pattern of temporal distribution as the entire dataset, with peaks in the periods of maximum landslide activity and a decrease in periods with less landsliding. However, landslides on average become larger in the most active periods, independently of the image resolution (Fig. 11a, b; Table 4).

Fig. 11 Distribution of landslide attribute values by period. a Area, b length, c compactness, d slope, e distance to stream

The geology values do not exhibit significant differences between the periods (Fig. 12). During the peak periods of landsliding, the number of landslides increases for each stratigraphic unit, but the increase for the Jurassic–Oligocene folded sediments is even stronger. Altogether, 50% of the landslides recorded in 1986–2013 occurred in the two periods 2001–2003 and 2003–2004, which correspond to the abundant winter precipitation in 2002–2004 (Ibatulin 2011). However, a 51–65% increase was observed for the Jurassic–Oligocene rocks, and only a 13–42% increase was observed for basement units. That is, the folded Jurassic–Oligocene units are more responsive to changes in the hydrometeorological factor. The saturation of permeable rocks with water in wet years activates the sliding surface in folds where permeable deposits are located above impermeable sediments and makes those areas particularly susceptible to landsliding.

Fig. 12 Landslide occurrence related to geology class by period (for lithological characterization of the geology classes, see Table 1)

The temporal changes in the distribution of aspect values do not exhibit a particular trend, except that their distribution in periods with more active landsliding is closer to the long-term average (Fig. 13). The slope values tend to be higher in periods with lower landslide activity, especially in 2005–2007 (Fig. 11d), when a particularly large share of the landslides occurred in basement rocks that correspond to a steeper terrain. The distance to stream attribute exhibits low values for the periods 1990–1998 and 1998–2001 (Fig. 11e) due to the larger size of landslides in these periods. Higher values are found in 2001–2003 because of many smaller landslides in Upper Oligocene–Neogene platform deposits away from streams and in 2012–2013 due to the prevalence of small landslides in this period.

Fig. 13 Distribution of landslide aspect values by period. a 1986–1990, b 1990–1998, c 1998–2001, d 2001–2003, e 2003–2004, f 2004–2005, g 2005–2007, h 2007–2009, i 2009–2011, j 2011–2012, k 2012–2013

6 Discussion and conclusions

We derived a set of landslide attributes for a multi-source multi-temporal inventory with over 2800 landslide polygons for the study area in southern Kyrgyzstan using a GIS-based approach. Standard GIS tools were combined with newly implemented functionality. The latter was used to calculate the landslide length, width, compactness and geological properties based on the highest and/or lowest points of the landslide polygon. These tools have been successfully applied to the landslide inventory in southern Kyrgyzstan. However, they can be applied to any landslide database that contains landslides in the form of polygons. For the derivation of topographic attributes, the 30 m cell size of the ASTER GDEM was sufficient for most landslide polygons, but problems arose with some of the smallest landslides. The applicability of the developed approach in other regions depends on how the landslide size in the region compares to the resolution of the available spatial data. A lower resolution of the available thematic maps and of the DEM is more tolerable for regions dominated by very large landslides, but it is more critical for study areas with smaller landslides.

The proposed algorithm for the automated calculation of landslide length and width consists of fitting a polyline inside the landslide polygon to represent its length. The algorithm performed adequately for the majority of the landslides. Unsatisfactory results were produced for the smallest landslides and those with irregular shapes, which required manual correction. Another problem occurred for automatically detected landslides with very long runout zones located on flat terrain or along the river bed. The proposed length calculation method may result in an overestimation of the length of such landslides. Such cases can be detected if an alternative length calculation is performed based on the elevation drop and the mean slope of the landslide polygon. In a recent publication, Niculiță (2016) proposed a method for the automated calculation of landslide length and width using minimum oriented bounding boxes of the landslide polygons and a DEM. His approach is also based on the combination of elevation and vector data, and it differentiates between long and wide landslides to handle them differently.
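The plausibility check mentioned above, an alternative length derived from the elevation drop and the mean slope, can be sketched as follows. The trigonometric relation (horizontal length equals elevation drop divided by the tangent of the mean slope), the tolerance ratio of 1.5 and the function names are assumptions for illustration. The exact formulation used in the study is not given in this section.

```python
import math

def length_from_relief(elevation_drop_m, mean_slope_deg):
    """Horizontal length implied by an elevation drop along a uniform slope."""
    if not 0.0 < mean_slope_deg < 90.0:
        raise ValueError("mean slope must lie strictly between 0 and 90 degrees")
    return elevation_drop_m / math.tan(math.radians(mean_slope_deg))

def is_length_suspicious(polyline_length_m, elevation_drop_m, mean_slope_deg, ratio=1.5):
    """Flag a polyline length exceeding the relief-based estimate by more than `ratio`."""
    estimate = length_from_relief(elevation_drop_m, mean_slope_deg)
    return polyline_length_m > ratio * estimate

# Hypothetical landslide: 100 m elevation drop on a 45-degree mean slope
print(length_from_relief(100.0, 45.0))           # relief-based estimate in metres
print(is_length_suspicious(400.0, 100.0, 45.0))  # polyline far longer than the estimate
print(is_length_suspicious(120.0, 100.0, 45.0))  # polyline close to the estimate
```

Landslides flagged in this way would still need visual inspection, since a large discrepancy may also reflect an irregular polygon shape rather than an overlong runout zone.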

The integration of multiple sources of landslide data in the inventory revealed that landslide polygons that were automatically detected from satellite images have more irregular zigzag boundaries than polygons digitized by hand. This difference led to the selection of the formula for compactness calculations in this study. It may also be important for the derivation of further geometric attributes in other cases.
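Why boundary irregularity matters for compactness can be shown with a perimeter-based measure. The sketch below uses the isoperimetric quotient \(4\pi A/P^{2}\) purely as an example of a measure that is sensitive to zigzag outlines; it is not presented as the formula selected in this study. The two polygons are hypothetical.

```python
import math

def area_and_perimeter(vertices):
    """Shoelace area and perimeter of a simple polygon given as (x, y) vertices."""
    twice_area = 0.0
    perimeter = 0.0
    # Pair each vertex with its successor, closing the ring at the end
    for (x1, y1), (x2, y2) in zip(vertices, vertices[1:] + vertices[:1]):
        twice_area += x1 * y2 - x2 * y1
        perimeter += math.hypot(x2 - x1, y2 - y1)
    return abs(twice_area) / 2.0, perimeter

def perimeter_compactness(vertices):
    """Isoperimetric quotient 4*pi*A / P**2 (1 for a circle, smaller otherwise)."""
    area, perimeter = area_and_perimeter(vertices)
    return 4.0 * math.pi * area / perimeter ** 2

# Hypothetical hand-digitized outline: a 90 m x 90 m square
smooth = [(0, 0), (90, 0), (90, 90), (0, 90)]
# Hypothetical raster-derived outline: same square with one 30 m cell protruding
zigzag = [(0, 0), (90, 0), (90, 90), (60, 90), (60, 120), (30, 120), (30, 90), (0, 90)]

print(perimeter_compactness(smooth))  # pi / 4
print(perimeter_compactness(zigzag))  # lower, although the shape is nearly the same
```

A single protruding cell adds about 11% to the area but about 17% to the perimeter, so the quotient drops. Measures that depend less on the perimeter are therefore easier to compare between automatically detected and manually digitized polygons.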

Collaboration with local specialists is necessary to adjust the selection of the attributes to their needs. Furthermore, steps toward the development of a unified methodology for the determination of selected attributes may be beneficial. Guzzetti et al. (2012) addressed the need for standards and the definition of best practices for the preparation of landslide inventories. This should involve not only the documentation of landslide locations and extents but also the methodology for attribute determination. This process could include the implementation of standardized procedures in the form of open-source code that can be shared among interested users and run as extensions for popular GIS software.

The multi-source character of the inventory used in this study made it possible to investigate the influence of landslide data sources on the distributions of the derived attributes. In the case of the two field-based sources, the inability to differentiate between single slope failures and the omission of smaller landslides led to a bias toward features typical for very large landslides, e.g., greater length, gentler slope values and shorter distance to streams. Due to the concentration of field campaigns on areas that had been previously known to experts as being landslide-prone, the documentation of some landslide types was more complete than that of other types. In turn, the short-term results of the automated detection suffer from the short time period covered since the period 2009–2013 was characterized by low landslide activity. Because many studies use a single inventory data source, it is important to understand possible distortions of the landslide data. The more complete the landslide inventory, the more reliable the results of the subsequent hazard assessment. The availability of methods that can automatically detect landslides from satellite imagery is crucial for systematic analyses of landslide occurrence. Such analyses will become increasingly feasible because the availability of suitable optical remote sensing data will continue to increase. The recent launch of the Sentinel-2 system provides new opportunities for acquiring imagery with sufficient spatial and temporal resolutions.

The present study enabled a quantification of the temporal variations of landslide occurrence. It was revealed that the spatial variations of landslide properties within the study area have been more significant than their differentiation in time. In the spatial aspect, the geological setting is the major factor that influences the distribution of landslide activity in the study area. In the temporal aspect, periods with the highest number of landslides also tend to have larger landslide sizes. For the assessment of the landslide hazard, this implies that similar factors influence the temporal and magnitude probability of landsliding. The lower spatial resolution of the satellite imagery available prior to 1998 has an effect on the distribution of attribute values over time, which reduces their comparability between the earlier and later time periods.

Because the results of long-term automated landslide detection from satellite imagery are only available for a \(58\times 29\) km large subset of the study area, the comparison of landslide attributes in time reflects the properties of this subset only. It would be desirable to extend the investigation of the long-term variation of landslide properties to the entire study area. However, the data availability is not sufficient for all parts of the study area. A better characterization of the geo-environmental properties of the study area, such as a geological map and a DEM of higher resolution, would also improve the derivation of landslide attributes. The geological map used in this study does not provide a systematic record of the superficial Quaternary deposits. A map of Quaternary deposits with an adequate representation of loess in the study area would be advantageous.

Due to the possibility of updating the inventory, it would be interesting to investigate the distribution of the landslide attributes based on a longer landslide record. The existing inventory reflects the period of high landslide intensity between the years 2002 and 2004. However, another strong peak of landslide activity in 1994 is not fully represented in the inventory. This is because the multi-temporal remote sensing data coverage in the most severely affected northern part of the study area was not sufficient to perform long-term automated detection of landslide occurrence. Furthermore, the rather low spatial and temporal resolutions of satellite images available for this period might have prevented the detection of some slope failures within the subset. If such peaks of landslide activity occur in the future, the availability of remote sensing data would allow a more complete documentation. In that case, the understanding of the temporal variability of landslide properties can be improved and compared to the changes in the hydrometeorological and seismic parameters over time. The availability of methods for efficient attribute derivation, which this paper addresses, is the basis for updating the landslide attribute values and performing these comparisons.

Knowledge of landslide properties acquired by analyzing landslide attributes provides the basis for defining and interpreting multivariate statistical models used in the calculations of landslide susceptibility and temporal probability of landsliding. In southern Kyrgyzstan, the attribute analysis has shown that some regions within the study area differ significantly from the rest. If an analysis of landslide susceptibility is to be performed, our findings indicate that it may be necessary to subdivide the study area into several parts and set up the models separately for each of those parts to accommodate the existing regional variability within the study area.