Introduction and background

Landslide dams are well known but relatively poorly studied. They are defined here as significant (Footnote 1), ephemeral or enduring blockages of a watercourse by a landslide, and can be analysed as part of multi-hazard, cascading slope-river systems where dam breaching and downstream inundation can be extremely damaging and dangerous (Costa and Schuster 1988; Davies et al. 2007; Dunant 2019; Fan et al. 2020). For example, the Dadu River landslide dam in China breached 10 days after formation in 1786 and killed over 100,000 people downstream (Dai et al. 2005).

In this communication, we present version 1.0 (v1.0) of the New Zealand Landslide Dam Database (NZLDD), the most comprehensive database of landslide dams in Aotearoa New Zealand to date, including landslide dams triggered by the 2016 Kaikōura Earthquake. We describe the database architecture and design, list compiled datasets, present basic statistics on the database, and provide a detailed data dictionary that includes a categorical measure of data quality. Of the more than 1000 landslide dams in the database, we selected a representative subset of 265 sites that were assessed in detail. Key quantitative variables, such as landslide and dam dimensions and the Dimensionless Blockage Index (DBI), are included with their definitions for this subset in NZLDD v1.0. The database is published on the Open Science Framework platform (https://osf.io; Footnote 2). It will be used in subsequent analysis to investigate landslide dam formation potential, stability, and breaching mechanisms.
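The DBI definition used in the NZLDD is given in the Supplementary Information and is not reproduced here. As a minimal sketch only, the snippet below assumes the form commonly used in the landslide dam literature, DBI = log10(A_b × H_d / V_d), where A_b is the catchment area upstream of the dam, H_d the dam height, and V_d the dam volume. The function name, argument names, and example values are hypothetical and are not taken from the database.

```python
import math


def dimensionless_blockage_index(catchment_area_m2, dam_height_m, dam_volume_m3):
    """Return log10(A_b * H_d / V_d) for one dam (assumed DBI form).

    With area in m^2, height in m and volume in m^3 the ratio is
    dimensionless, so no unit conversion factor is needed.
    """
    if min(catchment_area_m2, dam_height_m, dam_volume_m3) <= 0:
        raise ValueError("all inputs must be positive")
    return math.log10(catchment_area_m2 * dam_height_m / dam_volume_m3)


# Hypothetical dam: 50 km^2 catchment, 20 m high, 2 million m^3 of debris.
dbi = dimensionless_blockage_index(50e6, 20.0, 2e6)
print(f"DBI = {dbi:.2f}")
```

Under this assumed form, a larger catchment or taller dam raises the index, while a larger dam volume lowers it.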

Database architecture and design

The NZLDD comprises an ArcGIS geodatabase in New Zealand Transverse Mercator 2000 (NZTM2000) projection and contains five feature classes and one table. The landslide dam site point feature class (DamSite) indicates the location of the landslide dam along the river centreline, and the four polygon feature classes outline each landslide Source, Debris Trail, and Dam as well as the landslide-dammed Lake, where mapped (Figs. 1 and 2). Note that Debris Trail and Dam, as mapped here, together comprise the landslide deposit, with the Dam limited to the part of the deposit damming its respective watercourse (i.e., the Dam polygon is limited to the valley width). The table contains metadata, including quality rankings. See the “Data dictionary” section and Supplementary information for full definitions of attributes listed in Fig. 2 (Footnote 3).

Fig. 1

a Overview of the 1036 landslide dam sites in v1.0 of the NZLDD. b An example of the mapped components of a landslide dam in the database (S: Source, DT: Debris Trail, D: Dam, L: Lake; dam width WDam and length LDam as indicated). c An example of two landslide dams on one stream with one multi-component dam. Two landslides (labelled A and B) contribute to the downstream dam. A1 and A2 refer to two sources contributing to the same Debris Trail. Note that the upstream landslide (south of the multi-component example) is independent, and is thus labelled the same way as the example in (b). Each dam corresponds to a DamSite point, indexed using DamSiteID. Basemap: hillshade model generated from the 8 m national digital elevation model (LINZ 2012)

Fig. 2

Overview of the NZLDD structure. PK, primary key; FK, foreign key (see ESRI (2016) for definitions of primary and foreign keys). Bold attributes are supplied for all dams where possible, italicised attributes for the subset only, and all other attributes are calculated using ArcPy scripts for all dams where possible in NZLDD v1.0
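The key relationships described above can be sketched as plain Python records. This is an illustration under stated assumptions, not the geodatabase schema itself: only DamSite, DamSiteID, MetadataID and the four polygon feature class names come from the text, while the class layout, the `label` field, and the example values (loosely modelled on the multi-component dam in Fig. 1c) are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class DamSite:
    dam_site_id: int   # primary key (DamSiteID)
    metadata_id: int   # foreign key to the metadata table (MetadataID)


@dataclass(frozen=True)
class Component:
    kind: str          # "Source", "DebrisTrail", "Dam" or "Lake"
    dam_site_id: int   # foreign key back to DamSite
    label: str         # hypothetical free-text label, e.g. "A1"


def components_of(site, components):
    """Count the mapped polygons of each kind linked to one DamSite."""
    return Counter(c.kind for c in components if c.dam_site_id == site.dam_site_id)


site = DamSite(dam_site_id=2, metadata_id=7)
components = [
    Component("Source", 2, "A1"), Component("Source", 2, "A2"),
    Component("Source", 2, "B"),
    Component("DebrisTrail", 2, "A"), Component("DebrisTrail", 2, "B"),
    Component("Dam", 2, "D"), Component("Lake", 2, "L"),
    Component("Source", 1, "S"),  # belongs to a different, upstream dam
]
counts = components_of(site, components)
print(dict(counts))
```

Keying every polygon to a DamSite in this way is what allows one dam site to amalgamate several sources and debris trails.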

Datasets included

The database is a compilation of previously mapped published and unpublished landslide dams, as well as newly identified dams. The datasets span a range of event-based inventories, which include both rainfall- and earthquake-induced landslides, as well as individual case studies, compilation datasets, and composite events with unknown triggers (Table 1).

Table 1 Datasets included in the NZLDD v1.0. Numbers in last column refer to the number of dams in each inventory. Note there are duplicates amongst inventories, and some dams have been subsequently excluded from v1.0 of the database due to poor data quality

As part of creating the NZLDD, extensive new mapping was completed. This included mapping recent landslides, dams, and lakes, particularly in Southland, Fiordland (including those formed during the 2003 and 2009 Fiordland earthquakes), and Kaikōura, as well as remapping many previously mapped landslide dams (Table 1) to a higher level of accuracy and detail. To do this, all available maps and digital data on landslides were collated from earlier studies into a GIS. This included polygon, line, and point data of landslides and landslide dams (where available) captured at a range of accuracies and scales.

The following base data were used to facilitate the new mapping and remapping:

  • The national 8 m Digital Elevation Model (DEM) and associated hillshade model, created primarily from 20 m contours (LINZ 2012)

  • National c. 1.5 m resolution 2012–2014 cloud-minimised mosaics of SPOT 6 satellite imagery (Ministry for the Environment 2014)

  • The LINZ 1:50,000 topographic map and NZ-Imagery service on ArcGIS Online

  • Time-series Planet Labs 3 m resolution satellite imagery

  • Google Earth imagery

  • The LINZ 1:50,000 (Topo50) lake and river polygons (LINZ 2020a, b)

  • Landslides in the Landslide Database (Rosser et al. 2017) and GNS large landslide inventory (Hancox and Perrin unpublished)

  • QMAP 1:250,000 series landslide layers and landslide deposit geological units compiled at 1:50,000 scale (e.g. Cox and Barrell 2007; Forsyth et al. 2008; Rattenbury et al. 2010; Turnbull 2000; Turnbull et al. 2010)

  • Digital Surface Models (DSMs), hillshades and orthomosaics captured using Terrestrial Laser Scanning (TLS) and/or Unmanned Aerial Vehicles (UAVs) were used for detailed case studies (e.g. Hapuku Rock Avalanche; see Wolter et al. 2022)

  • In the Kaikōura area, the following datasets were used to locate landslide dams immediately post-earthquake that were not captured in versions 1, 2, and 3 of the Kaikōura Earthquake landslide inventory (Massey et al. 2018, 2019, 2021; Jones et al. in prep):

    1. 2015 30 cm regional aerial imagery (Environment Canterbury 2015)

    2. 2016 1 m Light Detection and Ranging (LiDAR) data and associated hillshade model (GNS Science 2022a; GNS Science et al. 2022a)

    3. 2017 1 m LiDAR data and associated hillshade model (GNS Science 2022b)

    4. 2017 30 cm regional aerial imagery (GNS Science et al. 2022b)

    5. 2 m InSAR-corrected difference model between 2015 and 2017 DSMs (Massey et al. 2020)

    6. 2019 30 cm regional aerial imagery (GNS Science 2022c)

Summary statistics

The NZLDD v1.0 geodatabase includes 1036 landslide dams throughout Aotearoa New Zealand. This is a significant increase in the number of documented dams both in New Zealand and worldwide: previous global inventories include up to ~ 1800 dams (cf. Fan et al. 2020; Shen et al. 2020).

Of these 1036 dams, 60–68% occurred in terrain of Cambrian to Mesozoic (510 to 100 Ma) basement rock (usually metamorphosed argillites and sandstones—“greywacke”) and 14–18% in Neogene (23 to 3 Ma) sedimentary rock terrains (depending on whether the Kaikōura dataset is included or not; Fig. 3). Most dam-forming landslides were classified according to Hungr et al. (2014) as rock avalanches, falls, slides, and topples (51–52%), compared with 26% in Fan et al. (2020).

Fig. 3

Summary pie charts for landslides within v1.0 of the NZLDD. The charts on the left exclude the 2016 Kaikōura Earthquake inventory (DamSite n = 470) whereas those on the right illustrate the full v1.0 database (DamSite n = 1036). Source geology has a higher count (n = 569 and n = 1152, respectively) because some dam sites have more than one source mapped and the other data presented here are based on dam site, which amalgamates multiple landslide sources and/or debris trails and dams, if applicable (see Fig. 1)

Many dams in the NZLDD (30–62%) formed because of earthquakes (Fig. 3; Table 2). For example, 470 landslides that formed dams in the database were triggered by the 2016 Kaikōura Earthquake. Miscellaneous triggers, such as progressive weakening, fluvial erosion, snowmelt, volcanic eruption, undercutting, creep, collapse, orientation of rocks, and changing reservoir levels, total 13% (if the Kaikōura Earthquake dams are not included) or 7% (if the Kaikōura Earthquake dams are included). Rainfall-induced landslides account for just 3–5% of the dams in the NZLDD. Dams with unknown or unclassified triggers comprise 52% or 28% of the NZLDD v1.0. In contrast, according to Fan et al. (2020), ~ 20% and ~ 45% of landslide dams worldwide were triggered by earthquakes and rainfall, respectively. In their database, miscellaneous triggers (snowmelt, volcanism, human, etc.) totalled 19% and unknown triggers totalled 16%. Note that earthquake-triggered landslides are typically mapped systematically in New Zealand after a seismic event. This was certainly the case in the Kaikōura region, where the landslide inventory (Jones et al. in prep) included even small dams and used high-resolution datasets that would have been unavailable for earlier events. Hence, the database may be biased towards more recent, earthquake event inventories. We have therefore included summary statistics both with and without the Kaikōura Earthquake dataset.
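Reporting each share twice, with and without one large event inventory, amounts to recomputing percentages over a filtered set of dam sites. The sketch below shows that calculation on a small set of made-up records; the field names (`inventory`, `trigger`) and the counts are hypothetical and do not reproduce the NZLDD figures.

```python
from collections import Counter


def trigger_shares(records, exclude_inventory=None):
    """Percentage of dams per trigger, optionally leaving out one inventory."""
    kept = [r for r in records if r["inventory"] != exclude_inventory]
    if not kept:
        return {}
    counts = Counter(r["trigger"] for r in kept)
    return {trigger: round(100 * n / len(kept), 1) for trigger, n in counts.items()}


# Ten hypothetical dam sites: six from one earthquake inventory, four from elsewhere.
records = (
    [{"inventory": "Kaikoura2016", "trigger": "earthquake"}] * 6
    + [{"inventory": "other", "trigger": "earthquake"}]
    + [{"inventory": "other", "trigger": "rainfall"}]
    + [{"inventory": "other", "trigger": "unknown"}] * 2
)

shares_all = trigger_shares(records)
shares_without = trigger_shares(records, exclude_inventory="Kaikoura2016")
print(shares_all)
print(shares_without)
```

In this toy case the earthquake share falls from 70% to 25% once the event inventory is excluded, which is the kind of sensitivity the paired statistics are meant to expose.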

Table 2 Significant landslide-triggering events with the number of dams in v1.0 of the NZLDD. Note that more dams could have formed in some events but were not included in this version of the NZLDD. Note also that the number of dams per event is compiled from different sources and is the total in the database for that event; it therefore does not always correspond to the number of dams shown by source in Table 1

Dam Types I and II, as defined in Fan et al. (2020) and the “Landslide dam sites” section, are the most common in the database (Fig. 3), unlike the global database, where Type III is most common. Most dams have failed completely, which aligns with international findings.

The landslides that dammed watercourses range in source volume from 4 m³ to 180 M m³, and the runout calculated aligns well with international landslide datasets (Brideau et al. 2021; Fig. 4). Some smaller landslides have a greater runout than international examples.

Fig. 4

Landslide relief (H) over runout (R) ratio vs landslide source volume for the NZLDD dams, plotted against international inventories collated in Brideau et al. (2021) (hollow points). Dotted lines indicate the mean and ± 2 standard deviation trendlines for the international inventories. All landslide types are included in the plot. Dark grey points use the 3D H/R ratio, and light grey points use the 2D H/R ratio (see SI Table 1)

Data dictionary

Landslide dam sites

This point feature class signifies the location of landslide dams and is situated at the upstream end of the dam. There is always one DamSite point for one Dam (see below). The point position is exactly on the river centreline so that river and catchment metrics can be systematically calculated. For consistency, river centrelines were generated using the 8 m national DEM (LINZ 2012; see the “Watershed analysis methodology” section for more details). For this reason, landslide dam points (DamSite in Fig. 2) may not always be immediately adjacent to the landslide dam polygons (Dam in Fig. 2). This is particularly the case in areas of low topographic gradient, due to the coarse resolution of the national DEM, and where the river centreline might not represent the most recent watercourse, for example, due to digitising errors or channel avulsion.

The DamSite feature class contains most attributes specific to each landslide dam. A summary is presented in Supplementary Information SI Table 1. Note that we followed the definitions in Fan et al. (2020) as closely as possible. Variations from these definitions are mentioned in the table and Figs. 5 and 6.

Fig. 5

Examples of dam types in the NZLDD, as defined by Costa and Schuster (1988). Points are DamSite locations. Coloured polygons indicate landslide Source, Debris Trail, and Dam, and landslide-dammed Lake (see Figs. 1 and 2). Blue arrows indicate downstream direction on dammed watercourse. Hillshade elevation models were generated with sun angle from the northwest (see “Datasets included” section for hillshade sources)

Fig. 6

Dam breach types identified and applied in the NZLDD

Watershed analysis methodology

The national DEM of New Zealand (LINZ 2012), which has a grid resolution of 8 m, was used for catchment analysis.

The following methodology was used to generate river centrelines and define catchment areas for each landslide dam using ArcGIS:

  1. The national 8 m DEM was ‘filled’ to remove artificial ‘sinks’ or depressions, on the assumption that all sinks are artefacts.

  2. Flow direction and flow accumulation grids were created to establish the flow path of streams and rivers throughout Aotearoa New Zealand.

  3. The flow accumulation raster was reclassified so that only cells with values > 5000 (Footnote 4) were retained. These are the river centrelines used for the NZLDD.

  4. The landslide DamSite points were snapped to the river centrelines and then used as ‘pour points’, defined as points on the surface where water exits the catchment (ESRI 2018). Each of these pour points was assigned a unique identifier that could be matched to its corresponding catchment. The ArcGIS Snap Pour Point tool moves each point to the cell of highest flow accumulation within the search radius, so a small radius of 4 m (corresponding to half the resolution of the DEM) was used to make sure that each point is snapped to the nearest river centreline rather than to a more distant cell with higher accumulation.

  5. The catchment upstream of each pour point, or dam, was then delineated using the Watershed tool. The catchment for each point was calculated individually to make sure that the full catchment upstream of each dam point was captured, as the tool delineates only the catchment area up to the next upstream pour point. This process was scripted so that each dam catchment was generated separately, converted to a polygon, and incorporated into a single dataset of catchment polygons. Maximum and minimum elevation for each dam catchment were also calculated.

  6. The area of each polygon, which represents the upstream catchment area for each dam, was joined back to the original landslide DamSite point using the unique identifier.
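Steps 5 and 6 can be illustrated without ArcGIS on a toy flow network. The sketch below is not the authors' ArcPy script: the cell names, the network layout, and the function names are hypothetical, and each cell simply drains to one downstream neighbour. It contrasts a single run over all pour points, where each cell is assigned to the first pour point it meets downstream, with a separate run per dam, and then joins the resulting area back to the dam identifier.

```python
CELL_SIZE_M = 8.0  # grid resolution of the national DEM

# Hypothetical flow network: a -> b -> c -> d -> out, with e -> c and f -> d.
DOWNSTREAM = {"a": "b", "b": "c", "e": "c", "c": "d", "f": "d", "d": "out", "out": None}


def upstream_catchment(pour_point, downstream=DOWNSTREAM):
    """All cells whose flow path passes through pour_point (inclusive)."""
    catchment = set()
    for cell in downstream:
        node = cell
        while node is not None:
            if node == pour_point:
                catchment.add(cell)
                break
            node = downstream[node]
    return catchment


def joint_catchments(pour_points, downstream=DOWNSTREAM):
    """Assign each cell to the first pour point met downstream (single run)."""
    result = {p: set() for p in pour_points}
    for cell in downstream:
        node = cell
        while node is not None and node not in result:
            node = downstream[node]
        if node is not None:
            result[node].add(cell)
    return result


# Two dams on the same stream, keyed by a hypothetical DamSiteID.
dam_cells = {101: "b", 102: "d"}

joint = joint_catchments(dam_cells.values())
individual = {dam_id: upstream_catchment(cell) for dam_id, cell in dam_cells.items()}

# Step 6: join catchment area (cell count x cell area) back to each DamSiteID.
area_m2 = {dam_id: len(cells) * CELL_SIZE_M ** 2 for dam_id, cells in individual.items()}
print(sorted(joint["d"]), sorted(individual[102]), area_m2)
```

For the downstream dam, the single run returns only four cells because the upstream dam claims cells a and b, whereas the individual run returns all six, which is why each dam catchment was generated separately.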

Landslide sources

This polygon feature class signifies the source area(s) of the landslide that formed the landslide dam. The Source was delineated using a variety of methods from manual mapping on 1 m resolution LiDAR to unsupervised techniques using remote sensing. The quality of data (e.g. resolution and accuracy) therefore varies depending on the technique used, as detailed in the Quality Rankings table (see the “Quality rankings” section).

This feature class contains attributes that are specific to the landslide source. A summary is presented in SI Table 2.

Landslide debris trails

This polygon feature class signifies the debris trail(s) of the landslide, defined as the deposit between the Source and Dam, and it can also include debris that did not block the watercourse (see below). Debris trails were not mapped if the deposit extent matched the Dam extent. Like the landslide Source, the Debris Trail was delineated using a variety of methods from manual mapping on 1 m resolution LiDAR to unsupervised techniques using remote sensing. The quality of data (e.g. resolution and accuracy) therefore varies depending on the technique used, as described in the Quality Rankings table (see the “Quality rankings” section).

This feature class contains attributes that are specific to the landslide debris trail. A summary is presented in SI Table 3.

Landslide dams

This polygon feature class signifies the area of landslide debris that blocked the watercourse, forming the landslide dam. Where possible, this includes only the debris that formed the dam while other debris that did not block the watercourse has been mapped separately into the Debris Trail feature class. Occasionally, the Dam polygon may be representative of the whole landslide deposit. In some instances, the dam no longer exists in the landscape and its extent has been interpreted. Like the landslide Source, the Dam was delineated using a variety of methods from manual mapping on 1 m resolution LiDAR to unsupervised techniques using remote sensing. The quality of data (e.g. resolution and accuracy) therefore varies depending on the technique used, as described in the Quality Rankings table (see the “Quality rankings” section).

This feature class contains attributes that are specific to the landslide dam. A summary is presented in SI Table 4.

Landslide-dammed lakes

This polygon feature class signifies the known or inferred area inundated by landslide-dammed lakes. In many instances, the lake no longer exists in the landscape and its extent was inferred based on geomorphic features such as sedimentary plains, wetlands, and deltas. If no obvious signs of a past lake exist for a particular case study, a lake was not mapped. Where the landslide-dammed lakes still exist today (typically for large, enduring dams), the current lake extent was taken from the LINZ topographic 1:50,000 scale mapping (LINZ 2020a, b). These are likely not the initial or maximum lake extents, particularly for old dams. Using the datasets available, we were able to map lakes for ~ 80% of landslide dams in the v1.0 dataset. The lake was delineated using a variety of methods from manual mapping on 1 m resolution LiDAR to using the LINZ 1:50,000 scale map data. The quality of data (e.g. resolution and accuracy) therefore varies depending on the technique used, as described in the Quality Rankings table (see the “Quality rankings” section).

This feature class contains attributes that are specific to the landslide-dammed lake. A summary is presented in SI Table 5.

Quality rankings

This table records the quality of each source dataset or reference. Where multiple references are given in the feature classes, the ranking of the highest overall quality record is assigned to the dataset. For example, if the dataset was originally mapped by Perrin and Hancox (unpublished) but updated by the NZLDD v1.0 authors using more recent satellite imagery or terrain models, then the metadata record for the updated mapping is linked to the landslide DamSite points via the MetadataID field. Only the highest quality records (i.e. those linked to the DamSite point by the MetadataID) have quality rankings provided for v1.0 of the database.
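The rule of linking each dam site to its highest quality record can be sketched as a simple selection. The ranking scale is defined in SI Table 6 and is not reproduced here, so this example assumes a hypothetical numeric `quality_rank` in which a lower number means higher quality; the record contents are likewise invented for illustration.

```python
def best_metadata_record(records):
    """Return the record with the best (lowest) hypothetical quality rank."""
    if not records:
        raise ValueError("at least one metadata record is required")
    return min(records, key=lambda r: r["quality_rank"])


# Two hypothetical references for the same dam: original mapping and a later update.
records = [
    {"MetadataID": 12, "reference": "original mapping", "quality_rank": 3},
    {"MetadataID": 27, "reference": "NZLDD v1.0 remapping", "quality_rank": 1},
]

best = best_metadata_record(records)
print(best["MetadataID"])  # the MetadataID that would be stored on the DamSite point
```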

In v1.0 of the database, this table contains attributes that are specific to the quality of each whole source dataset, not individual landslide dams. In v2.0, the quality rankings will be landslide-specific. A summary is presented in SI Table 6.