Offshore application of landslide susceptibility mapping using gradient-boosted decision trees: a Gulf of Mexico case study

Dyer, Alec S.; Mark-Moser, MacKenzie; Duran, Rodrigo; Bauer, Jennifer R.

doi:10.1007/s11069-024-06492-6

Original Paper
Open access
Published: 24 February 2024

Volume 120, pages 6223–6244, (2024)

Natural Hazards

Abstract

Among natural hazards occurring offshore, submarine landslides pose a significant risk to offshore infrastructure installations attached to the seafloor. Because offshore regions are important for current and future energy production, there is a need to anticipate where future landslide events are likely to occur to support planning and development projects. Using the northern Gulf of Mexico (GoM) as a case study, this paper performs Landslide Susceptibility Mapping (LSM) using a gradient-boosted decision tree (GBDT) model to characterize the spatial patterns of submarine landslide probability over the United States Exclusive Economic Zone (EEZ) where water depths are greater than 120 m. With known spatial extents of historic submarine landslides and a Geographic Information System (GIS) database of known topographical, geomorphological, geological, and geochemical factors, the resulting model was capable of accurately forecasting potential locations of sediment instability. Results of a permutation modelling approach indicated that LSM accuracy is sensitive to the number of unique training locations, with model accuracy becoming more stable as the number of training regions was increased. The influence that each input feature had on predicting landslide susceptibility was evaluated using the SHapley Additive exPlanations (SHAP) feature attribution method. Areas of high and very high susceptibility were associated with steep terrain including salt basins and escarpments. This case study serves as an initial assessment of the machine learning (ML) capabilities for producing accurate submarine landslide susceptibility maps given the current state of available natural hazard-related datasets and conveys both successes and limitations.

1 Introduction

Submarine landslides are a significant natural hazard known to occur widely throughout the ocean seafloor. Historically, the movement of sediment downslope has damaged seabed infrastructure, including the destruction of a Taylor Energy platform in 2004, which caused the release of up to 700 barrels of oil per day for over a decade in the Gulf of Mexico (GoM) and cost approximately $500 million to decommission (Kaiser et al. 2009; Casey 2019). Such damages not only result in costly repairs or replacements but may slow down petroleum transportation to onshore facilities and introduce a potentially devastating marine and coastal environment stressor. These risks pose a threat to existing infrastructure for petroleum production as well as to future infrastructure installations, such as those for carbon storage and wind farms (Offshore Energy 2018). Recently, the impact of large mass movements in the deep ocean that cause tsunamis has been examined due to the significant threat to shoreline communities and economies, such as the devastating 1998 Papua New Guinea tsunami event that caused over 1,600 fatalities (Vanneste et al. 2013; Pampell‐Manis et al. 2016; Sawyer et al. 2019). It is imperative to understand the potential of landslide occurrence in offshore regions to support planning strategies for offshore structure placement, reduce the chance of catastrophic incidents, and protect human, environmental, and economic safety.

The term landslide has become a common label used to represent the types of mass-transport deposits for terrestrial and submarine environments, causing issues in terminology and implications among fields of study related to sediment movement (Shanmugam and Wang 2015). In submarine environments, types of sediment instabilities can be classified into rockfalls, slides or slumps, flows, and turbidity flows (Brunsden and Prior 1984). However, it is difficult to differentiate between these various types of mass movement in the submarine environment. Therefore, this study refers to all types of submarine slope failures as landslides. Landslides are initiated when driving forces exceed the resistance forces of the material composing a surface (Anderson and Anderson 2010). Conditions for crossing the force-balance threshold in submarine landslides may be met by a variety of triggers, including gas migration, wave forcing, and earthquakes, which can inform modelling techniques that assess the probability of a landslide occurrence.

Landslide susceptibility mapping (LSM) is a quantitative method that enables statistical or machine learning (ML) models to calculate the probability of a landslide occurring at a given location based on various factors relating to landslide events, thereby characterizing the spatial patterns of underlying landslide mechanisms (Reichenbach et al. 2018). This LSM framework is static, with the assumption that the environmental conditions at the time of future landslide events for a given area will be similar to the conditions of previous landslide events in that same area. Previous terrestrial studies have used a variety of predictive models to perform LSM. Methods include random forests (Micheletti et al. 2014), generalized additive models (Chen et al. 2017), ensemble decision trees (Sahin 2020), and deep neural networks (Shahri et al. 2019; Wang et al. 2020). These methods are used in conjunction with a Geographic Information System (GIS) to produce a spatially continuous prediction map. While most LSM studies have been conducted on terrestrial systems, little is known about the capability of LSM with regard to submarine landslides (Reichenbach et al. 2018), as these methods have seldom been applied to submarine environments (Shan et al. 2021). Similar submarine slope instability assessments have been completed (e.g., Hitchcock et al. 2010; Collico et al. 2020; Obelcz et al. 2020); however, there is a need to further understand the submarine application of LSM at larger spatial scales.

This paper presents the application of LSM to an offshore region, using the northern GoM as a case study. With currently available spatial data for factors that relate to the occurrence of submarine landslides in the study region, a gradient-boosted decision tree (GBDT) is used for supervised ML to assess the accuracy at which geospatial LSMs can perform in this region, using logistic regression (LR) as a baseline model. Beyond model performance, a feature attribution analysis was performed using SHapley Additive exPlanations (SHAP) to provide insight into which variables are most influential for assessing submarine landslide potential. This LSM application to the northern GoM serves as a first attempt to map landslide potential at a large scale in a remote, offshore region, with an assessment of the capabilities using currently available data and of how future studies can be improved.

2 Factors relating to submarine landslide potential

Several key factors influence the potential of a submarine landslide occurrence. These factors can be heterogeneous over space and time, and factors in one region may not have the same influence in a different region. Submarine landslide factors that initiate slope failure may be related to those found in terrestrial LSM applications; however, others are more specific to submarine LSM. Categories of factors for submarine LSM can include topographical (McAdoo et al. 2000; Shahri et al. 2019), geological (Cooper and Hart 2002; Martin and Bouma 1982; Tripsanas et al. 2004; Maloney et al. 2020; Masson et al. 2006), geomorphological (McAdoo et al. 2000; Sassen et al. 1999; Milkov and Sassen 2000), and geochemical (Maloney et al. 2020; Feseker et al. 2014; Cooper and Hart 2002). Together, these factors act as a proxy for submarine conditions susceptible to landslide initiation and can be used to perform submarine LSM. A full description of these factors is reported in Online Resource 1.

3 Case study: Gulf of Mexico

Since the 1950s, submarine landslides in the northern GoM have been studied with an interest in protecting offshore structures for petroleum production (e.g., Shepard 1955; Coleman et al. 1978). With offshore energy infrastructure initially placed in shallow, nearshore waters off the GoM coast, early submarine landslide studies focused on the Mississippi River Delta Front (MRDF; e.g., Coleman et al. 1978, 1980). Since then, offshore projects have explored deeper waters to access additional resources with deepwater (water depth > 1,000 feet) and ultra-deepwater (water depth > 5,000 feet) activities (Bureau of Ocean and Energy Management 2008). With ongoing dependence upon marine infrastructure to support energy production, reliance on marine economies, and the likely advent of carbon storage projects, there is a need to map the potential of submarine landslides in remote areas of the seafloor. Currently, no basin-scale LSM applications for the northern GoM have been published. This is likely due to limited studies and data availability. However, with recent advances in sensor and seismic technology, open-source high-resolution bathymetry and spatial seafloor hazard data have become available, and there is an opportunity to perform LSM for offshore regions.

3.1 Study area

The case study area includes a portion of the northern GoM, parts of which are hot spots for petroleum production in the United States (U.S.) Exclusive Economic Zone (EEZ). The specific boundary of the study area is the U.S. EEZ in the GoM where the water depth is greater than 120 m (Fig. 1). This region extends out to 200 nautical miles offshore and covers a total area of 386,753 km². Notable regions within the study area include the Texas (TX)-Louisiana (LA) Slope, the Sigsbee Escarpment, the Mississippi Canyon, the De Soto Canyon, and the Florida Escarpment. The GoM basin is the result of an extinct extensional regime that split the previously emplaced Louann salt sheet into the northern GoM salt basin and the Campeche salt basin (Galloway 2008). The Louann salt sheet forms the continental slope of the northern GoM before the Sigsbee Escarpment descends into the abyssal plain. Recent (late Neogene-present) sedimentation rates are greatest along the central GoM coast margin, with the heaviest sedimentation supplied by the Atchafalaya and Mississippi Rivers. The resulting progradation supplies massive amounts of sediments to the GoM basin that are concentrated in the Mississippi Canyon and then redistributed through geomorphologic processes along the continental shelf and onto the abyssal plain. Due to these high sedimentation rates, the seabed sustains a significant load that affects its stability. Further, the Louann salt sheet deforms and migrates continuously due to heavy clastic sedimentation rates that were initiated in the Late Jurassic-Early Cretaceous and continue to load the salt sheet and drive gravity tectonics in modern times (Galloway 2008). Salt diapirism is responsible for the many salt-withdrawal mini-basins that introduce high variability to the bathymetry, the differential accumulations of sediment, and much of the subsurface to seafloor structural complexity (i.e., faults and fractures). 
Dense structural complexity indicates a greater potential for fluid migration that utilizes the high permeability provided by fractures to relieve fluid or gas in pressurized subsurface reservoirs. Where these fluid migration pathways breach the seafloor is indicated by seeps, mud volcanoes, hydrates, and chemosynthetic communities.

Four regions where historic landslides have been digitized were used for analysis (Dyer et al. 2022). Each region is labeled A-D from west to east across the study area, as shown in Fig. 1. Region A has an area of 8,508 km² and is located within the TX-LA Slope salt basin. Region B has an area of 3,575 km² and is also located within the TX-LA Slope salt basin. Region C has an area of 17,577 km² and primarily covers a portion of the northern GoM salt basin and the western flank of the Mississippi Canyon. Region D has an area of 5,389 km² and is located southwest of the De Soto Canyon.

3.2 Materials and methods

3.2.1 GIS feature database

A spatial database was curated containing 20 features relating to factors that are known to affect submarine landslide potential in the northern GoM. Topographic features include elevation, aspect, slope, and curvature (IOC et al. 2003). Geomorphology features include basins, canyons, and escarpments (Harris et al. 2014) as well as channels (Bureau of Ocean & Energy Management 2016a). Geological factors include faults (Diegel et al. 1995; United States Geological Survey 2004a), salt diapirism (United States Geological Survey 2004b), sediment thickness (Twichell et al. 1995), and sediment type (Buczkowski et al. 2020). Lastly, geochemical factors include gas presence, pockmarks, mud volcanoes, and seeps (Bureau of Ocean & Energy Management 2016a) as well as hydrates (Twichell et al. 1996; Majumdar et al. 2017). All spatial features that are represented as points, lines, or polygons were converted to a continuous surface by using the Euclidean Distance tool provided by ArcGIS Pro (Environmental Systems Research Institute 2022). Derivatives of elevation including aspect, slope, and curvature were created using the Surface Parameters tool in ArcGIS with a 3,000-m search radius (Environmental Systems Research Institute 2022). The sediment type data acquired from the usSEABED database (Schweitzer et al. 2020) was divided into features depicting the percent coverage of sand, mud, gravel, and rock. All rasters are sampled to a 500-m spatial resolution. Source information and maps of each feature can be found in Table S1 and Fig. S1, respectively (Online Resource 1).
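The conversion of point features to a continuous distance surface can be illustrated with a small plain-Python sketch. This is not the ArcGIS Pro tool itself: it is a brute-force stand-in that measures the straight-line distance from each 500-m cell centre to the nearest feature point, and the grid size and the seep location are hypothetical values chosen only for illustration.

```python
import math

def distance_raster(n_rows, n_cols, feature_points, cell_size=500.0):
    """Distance (m) from each cell centre to the nearest feature point (brute force)."""
    raster = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            # Cell centre in map units, with the grid origin at (0, 0).
            cx, cy = (c + 0.5) * cell_size, (r + 0.5) * cell_size
            row.append(min(math.hypot(cx - px, cy - py) for px, py in feature_points))
        raster.append(row)
    return raster

# Hypothetical seep located at the centre of the first cell of a 3 x 4 grid.
seep_distance = distance_raster(3, 4, [(250.0, 250.0)])
```

Each cell of `seep_distance` then holds a continuous value that a model can use as a "distance to seeps" feature; line and polygon features would first need to be densified into points under this simplified approach.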

It is important to note that even though ocean surface waves can trigger landslide events in water depths less than approximately 120 m (Henkel 1970; Maloney et al. 2020), this factor is not included in this study. Therefore, analysis for this case study was limited to water depths greater than 120 m in the U.S. EEZ. Furthermore, a visual inspection of the regional sediment accumulation rate predicted by Restreppo et al. (2020) in Fig. S2 (Online Resource 1) shows that there is little variation in the sediment accumulation rate for the study region. For that reason, sediment accumulation rate is not included in this case study.

The historic submarine landslide inventory utilized in this study as observational data was curated by Dyer et al. (2022). This dataset represents the boundaries of where sediment volume was lost (also referred to as the depletion area) for historic submarine landslides in four regions within the study area (Fig. 2), offering a dataset with minimal false negatives (i.e., non-identified landslides). These landslide observations were added to the GIS database by rasterizing to the same grid as the input features. Region A had 34,049 observations with 1,674 (4.92%) of them having a positive landslide class. Region B had 14,161 observations with 892 (6.30%) of them having a positive landslide class. Region C had 70,301 observations with 4,342 (6.18%) of them having a positive landslide class. Lastly, Region D had 21,527 observations with 714 (3.32%) of them having a positive landslide class.
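As a quick arithmetic check on the class imbalance described above, the positive-class percentages can be recomputed from the reported counts. The sketch below also converts observation counts to approximate areas under the assumption that each observation is one 500-m raster cell (0.25 km²); that assumption is our reading of the gridding step, not a statement from the inventory.

```python
# (observations, positive landslide observations) in the order the four
# regions are reported in the text.
region_counts = [
    (34_049, 1_674),
    (14_161, 892),
    (70_301, 4_342),
    (21_527, 714),
]

CELL_AREA_KM2 = 0.5 * 0.5  # assumption: one observation = one 500-m cell

positive_pct = [round(100 * n_pos / n_obs, 2) for n_obs, n_pos in region_counts]
approx_area_km2 = [n_obs * CELL_AREA_KM2 for n_obs, _ in region_counts]

total_obs = sum(n_obs for n_obs, _ in region_counts)
total_pos = sum(n_pos for _, n_pos in region_counts)
overall_pct = round(100 * total_pos / total_obs, 2)
```

Under these counts roughly one observation in eighteen is a landslide cell overall, which is the imbalance that the under-sampling step is later meant to address.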

3.2.2 Mutual information

To assess the statistical relationship between each input feature and the landslide and non-landslide groups, Mutual Information (MI) was calculated for each input feature with the landslide class as the target variable. MI is a well-established concept in information theory that is used to estimate the dependency or relatedness between two variables (Ross 2014). Here, we measured the information shared between each continuous feature and the discrete class label (landslide or non-landslide) by performing an entropy estimation using a nearest-neighbors method designed by Kraskov et al. (2004) and Ross (2014). In this way, the value that each feature may have in discerning between the two classes was estimated. MI values closer to zero indicate that the feature is independent of the landslide class, and larger values indicate a stronger dependency between the feature and the class. Therefore, features with higher MI values will likely provide more valuable information for classification modelling. This functionality is provided by the scikit-learn package (Pedregosa et al. 2011).
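To make the quantity concrete, the sketch below computes a simple binned (plug-in) MI estimate in nats between one continuous feature and a binary class label. This is deliberately not the nearest-neighbors estimator used in the study; it is a simpler stand-in, and the function name, the bin count, and the toy data are all illustrative assumptions.

```python
import math
from collections import Counter

def binned_mutual_information(values, labels, n_bins=10):
    """Plug-in MI estimate (nats) between a binned continuous feature and a class label."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # guard against a constant feature
    # Map each value to a bin index in [0, n_bins - 1].
    bins = [min(int((v - lo) / width), n_bins - 1) for v in values]
    n = len(values)
    joint = Counter(zip(bins, labels))
    n_bin = Counter(bins)
    n_lab = Counter(labels)
    mi = 0.0
    for (b, y), c in joint.items():
        # p(b, y) * log( p(b, y) / (p(b) * p(y)) ), written with raw counts
        mi += (c / n) * math.log(c * n / (n_bin[b] * n_lab[y]))
    return mi

# Toy data (made up): 50 non-landslide and 50 landslide observations.
labels = [0] * 50 + [1] * 50
separating = [0.1] * 50 + [0.9] * 50   # feature value determines the class
uninformative = [0.1, 0.9] * 50        # same value distribution in both classes
mi_separating = binned_mutual_information(separating, labels)
mi_uninformative = binned_mutual_information(uninformative, labels)
```

In this toy case the uninformative feature scores zero, while the perfectly separating feature reaches ln 2 (about 0.69 nats), which is the upper limit for a balanced binary target.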

3.2.3 Collinearity

A Pearson correlation matrix was created between all the input variables to assess the amount of collinearity between the input features. Values range from −1 to 1. Here, any two variables with a correlation of 0.8 or greater, or of −0.8 or lower, were considered highly correlated, in which case it is unwise to include both features in the ML models. Between two highly correlated features, the feature with the lower MI score was dropped from the analysis.
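The screening rule can be sketched in a few lines of plain Python. The helper names, the toy feature values, and the MI scores below are made up for illustration and are not values from the study.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def drop_collinear(features, mi_scores, threshold=0.8):
    """Names to drop: for each pair with |r| >= threshold, the feature with the lower MI."""
    dropped = set()
    for a, b in combinations(features, 2):
        if a in dropped or b in dropped:
            continue
        if abs(pearson(features[a], features[b])) >= threshold:
            dropped.add(a if mi_scores[a] < mi_scores[b] else b)
    return dropped

# Toy columns (made up), one value per pixel.
toy_features = {
    "mud":    [10, 20, 30, 40, 50],
    "gravel": [50, 41, 29, 20, 10],  # strongly anti-correlated with mud
    "slope":  [3, 1, 4, 1, 5],
}
toy_mi = {"mud": 0.30, "gravel": 0.20, "slope": 0.10}  # made-up MI scores
to_drop = drop_collinear(toy_features, toy_mi)
```

Taking the absolute value of the coefficient covers both the positive and the negative side of the threshold in a single comparison.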

3.2.4 Frequency analysis

Frequency analysis was completed for each of the remaining features to provide statistical results that may indicate higher or lower landslide susceptible groups of feature values. For each feature, the continuous values were split up into a set of predefined bins. For each bin, the total and percent of pixel values within that range bin were given, as well as for the total and percent of positive landslide classes that fall within that range bin. The frequency analysis is provided in Table S2 (Online Resource 1).
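A minimal sketch of such a per-bin tabulation is given below. The bin edges, the half-open bin convention, and the toy slope values are illustrative assumptions and are not the bins used for Table S2.

```python
from bisect import bisect_right

def frequency_table(values, is_landslide, edges):
    """Per-bin pixel counts/percentages and landslide counts/percentages.

    `edges` are ascending bin edges; bin i covers [edges[i], edges[i + 1]).
    Values outside the outer edges are ignored.
    """
    n_bins = len(edges) - 1
    total = [0] * n_bins
    slides = [0] * n_bins
    for v, s in zip(values, is_landslide):
        i = bisect_right(edges, v) - 1
        if 0 <= i < n_bins:
            total[i] += 1
            slides[i] += int(s)
    n_total, n_slides = sum(total), sum(slides)
    rows = []
    for i in range(n_bins):
        rows.append({
            "bin": (edges[i], edges[i + 1]),
            "pixels": total[i],
            "pixels_pct": 100 * total[i] / n_total if n_total else 0.0,
            "landslides": slides[i],
            "landslides_pct": 100 * slides[i] / n_slides if n_slides else 0.0,
        })
    return rows

# Toy slope values in degrees (made up) with their landslide flags.
toy_slope = [1, 2, 6, 7, 12, 14, 3, 8]
toy_flags = [0, 0, 0, 1, 1, 1, 0, 0]
slope_table = frequency_table(toy_slope, toy_flags, edges=[0, 5, 10, 15])
```

Comparing `pixels_pct` with `landslides_pct` for each bin shows where landslides are over-represented relative to the share of the seafloor that falls in that bin.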

3.2.5 Predictive modelling

This study utilized a GBDT model to predict the binary target (landslide or non-landslide) and forecast landslide susceptibility. This type of ML model is an ensemble method that combines several weak prediction models (decision trees) to create a highly accurate model, a technique known as boosting. More specifically, gradient boosting methods fit the base learners sequentially, with each new learner trained to reduce the loss left by the learners before it. As a result, this boosting method reduces overfitting and improves overall performance (Friedman 2002). Also, GBDTs are robust to outliers and can model non-linear feature interactions (Friedman 2002; Elith et al. 2008). With these defining characteristics, GBDTs have gained wide acceptance for both regression and classification applications, and these models have been shown to perform with high accuracy in LSM studies (Song et al. 2018; Sahin 2020). Here, the eXtreme Gradient Boosting (XGBoost) algorithm of T. Chen and Guestrin (2016) was utilized as the GBDT model. XGBoost is a boosting algorithm that can handle large datasets and provides a parallelization feature to improve computational speed (T. Chen and Guestrin 2016). Additionally, LR has been utilized for LSM (Raja et al. 2017) and is reported as a baseline using the scikit-learn package (Pedregosa et al. 2011).
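The sequential idea behind boosting can be shown with a toy plain-Python sketch. It is a simplification under stated assumptions: it uses squared-error loss and one-split stumps on a single feature, whereas XGBoost adds regularization and second-order gradient information and the study used a classification objective. All names and data here are illustrative.

```python
def fit_stump(x, residuals):
    """Best single-split regression stump on a 1-D feature (least squares)."""
    best = None
    for threshold in sorted(set(x))[:-1]:  # keep at least one point on the right
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        lmean, rmean = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lmean) ** 2 for r in left) + sum((r - rmean) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, lmean, rmean)
    _, threshold, lmean, rmean = best
    return lambda xi: lmean if xi <= threshold else rmean

def boost(x, y, n_rounds=20, learning_rate=0.3):
    """Gradient boosting for squared error: each stump fits the current residuals."""
    base = sum(y) / len(y)
    stumps, losses = [], []
    pred = [base] * len(y)
    for _ in range(n_rounds):
        # Residuals are the negative gradient of 0.5 * (y - pred)^2.
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        stump = fit_stump(x, residuals)
        stumps.append(stump)
        pred = [pi + learning_rate * stump(xi) for xi, pi in zip(x, pred)]
        losses.append(sum((yi - pi) ** 2 for yi, pi in zip(y, pred)) / len(y))
    predict = lambda xi: base + learning_rate * sum(s(xi) for s in stumps)
    return predict, losses

# Toy data (made up): the target switches from 0 to 1 between x = 4 and x = 5.
toy_x = [1, 2, 3, 4, 5, 6, 7, 8]
toy_y = [0, 0, 0, 0, 1, 1, 1, 1]
predict, training_loss = boost(toy_x, toy_y)
```

The training loss shrinks round by round because every new stump is fitted to what the current ensemble still gets wrong; the learning rate controls how much of each correction is applied.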

To assess the predictive accuracy of a classifier and its ability to distinguish between landslide and non-landslide observations, a permutation method was used. This approach performs training and testing on three ratios of training–testing sets: 1:1 (one training region, one testing region), 2:1 (two training regions, one testing region), and 3:1 (three training regions, one testing region). For each ratio, all permutations of the possible arrangements for training and testing sets using the four training–testing regions were completed. This approach measures how the number of unique training regions affects accuracy and offers an assessment of all four training–testing regions.
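The arrangements produced by this permutation scheme can be enumerated directly. The sketch below uses the observation counts reported for the four regions; labelling them A to D in reporting order is an assumption made for illustration, as is the helper name.

```python
from itertools import combinations

# Observation counts in the order the four regions are reported in the text.
# Labelling them A-D is an assumption made for illustration.
n_obs = {"A": 34_049, "B": 14_161, "C": 70_301, "D": 21_527}

def train_test_splits(regions, n_train):
    """All (training regions, held-out testing region) arrangements."""
    splits = []
    for train in combinations(sorted(regions), n_train):
        for test in sorted(regions):
            if test not in train:
                splits.append((train, test))
    return splits

# Keys 1, 2, 3 correspond to the 1:1, 2:1, and 3:1 ratios.
splits_by_ratio = {k: train_test_splits(n_obs, k) for k in (1, 2, 3)}

# Smallest and largest training-set size (before under-sampling) for each ratio.
train_size_range = {
    k: (
        min(sum(n_obs[r] for r in train) for train, _ in splits),
        max(sum(n_obs[r] for r in train) for train, _ in splits),
    )
    for k, splits in splits_by_ratio.items()
}
```

With four regions this gives 12 arrangements for 1:1, 12 for 2:1, and 4 for 3:1, so every region serves as the held-out testing region in each ratio group.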

The following ML workflow (Fig. S3, Online Resource 1) was performed on each permutation of training and testing sets. A randomized search method was used to tune several parameters of each model (Fig. S3, Fig. S4, Online Resource 1), which executes 10 parameter permutations and evaluates model performance using stratified k-fold cross-validation (CV) on the training set. With the tuned parameters, an optimized model pipeline was fitted to the training set. For the GBDT model, the testing set was used as the early stopping criterion during the optimized model phase, with training ceased once the Area Under the Receiver Operating Characteristic Curve (ROC AUC, hereafter referred to as AUC) score did not increase after 10 training iterations (epochs). Early stopping can be a critical part of GBDT models due to the potential of overfitting to the training set.
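The early stopping rule can be expressed independently of any particular library. The sketch below takes a sequence of per-iteration AUC scores and returns the iteration that would be kept; the function name and the score sequence are hypothetical, and a real boosting library would evaluate the scores while training rather than afterwards.

```python
def early_stopping_round(auc_per_round, patience=10):
    """Return (best_round, best_auc); stop once AUC has not improved for `patience` rounds."""
    best_round, best_auc = 0, float("-inf")
    for i, auc in enumerate(auc_per_round):
        if auc > best_auc:
            best_round, best_auc = i, auc
        elif i - best_round >= patience:
            break  # no improvement in the last `patience` rounds
    return best_round, best_auc

# Made-up evaluation scores: early gains, then a plateau. The final 0.99 is
# never reached because training stops during the plateau.
toy_auc = [0.60, 0.70, 0.75, 0.78] + [0.77] * 15 + [0.99]
kept_round, kept_auc = early_stopping_round(toy_auc)
```

Here the score peaks at round 3 (zero-based), ten further rounds pass without improvement, and training stops at round 13 with the round-3 model retained.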

Prior to model fitting throughout the ML workflow, the training and testing sets were sent through a data processing pipeline. First, all input features were scaled from 0 to 1. Next, all missing values were filled using k-nearest neighbor (KNN) imputation with five neighbors (Cover and Hart 1967; Triguero et al. 2019). Lastly, due to an imbalance in the two target classes, under-sampling was performed on the training set to reduce the number of non-landslide observations until equal to the number of landslide observations.
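Two of these steps, the 0 to 1 scaling and the under-sampling of the majority class, can be sketched in plain Python as follows. The KNN imputation step is omitted for brevity, and the helper names, the random seed, and the toy data are illustrative assumptions rather than details of the study's pipeline.

```python
import random

def fit_min_max(column):
    """Return a function that rescales values to [0, 1] using this column's range."""
    lo, hi = min(column), max(column)
    span = (hi - lo) or 1.0  # constant column: avoid division by zero
    return lambda v: (v - lo) / span

def undersample(rows, labels, seed=0):
    """Randomly drop non-landslide rows (label 0) until both classes are equal in size."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    keep = sorted(pos + rng.sample(neg, len(pos)))  # assumes len(neg) >= len(pos)
    return [rows[i] for i in keep], [labels[i] for i in keep]

# Toy training column (made up): slope in degrees with two landslide pixels.
toy_slope = [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]
toy_labels = [0, 0, 0, 0, 0, 0, 0, 1, 0, 1]

scale = fit_min_max(toy_slope)
scaled = [scale(v) for v in toy_slope]
balanced_rows, balanced_labels = undersample(scaled, toy_labels)
```

Under-sampling is applied only to the training rows, so the testing set keeps its natural class imbalance when the metrics are computed.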

Model performance was evaluated with the testing set of each permutation using standard metrics for a binary classification problem including accuracy, precision, recall, and AUC. An accuracy score measures the proportion of correctly classified observations. A precision score measures the proportion of positive (landslide) predictions that were truly positive. A recall score measures the proportion of the positive observations that were correctly classified. The AUC score summarizes the true positive rate (TPR) and the false positive rate (FPR) over all cut-off thresholds and measures the ability of a classifier to distinguish between two classes. For all metrics, values range from 0 to 1, where lower values indicate poor performance and higher values indicate good performance. Additionally, receiver operating characteristic (ROC) curves were provided for the training and testing sets. ROC curves compare the FPR to the TPR and illustrate the performance of a binary classifier as the cut-off threshold is adjusted.
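The four metrics can be written out from their definitions in a few lines. The sketch below is a plain-Python illustration with made-up labels and scores; the 0.5 cut-off is an assumption, and AUC is computed through its rank interpretation (the probability that a randomly chosen positive scores higher than a randomly chosen negative).

```python
def confusion(y_true, y_pred):
    """Counts of true positives, false positives, false negatives, true negatives."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    return tp, fp, fn, tn

def classification_metrics(y_true, scores, threshold=0.5):
    """Accuracy, precision, and recall at a fixed cut-off threshold."""
    y_pred = [int(s >= threshold) for s in scores]
    tp, fp, fn, tn = confusion(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }

def roc_auc(y_true, scores):
    """AUC as the share of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy imbalanced testing set (made up): 2 landslide and 8 non-landslide pixels.
toy_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
toy_scores = [0.9, 0.4, 0.8, 0.6, 0.3, 0.2, 0.2, 0.1, 0.1, 0.1]
toy_metrics = classification_metrics(toy_true, toy_scores)
toy_auc = roc_auc(toy_true, toy_scores)
```

In this toy case accuracy is 0.7 and AUC is 0.875 while precision is only one third, which shows how a few false positives can depress precision when the positive class is rare.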

3.2.6 Feature attribution

SHapley Additive exPlanations (SHAP) (Lundberg and Lee 2017) was used to assess the importance that each feature has in landslide classification. The SHAP method is an application of game theory and can be used to estimate Shapley values (Shapley 1997), which provide consistent and accurate feature attribution values. TreeSHAP, a SHAP method designed for tree-based models, was utilized to estimate SHAP values and enables fast computation speeds (Lundberg et al. 2018). Here, the SHAP values reported are the absolute weighted average of Shapley values for each tree in a GBDT model and represent the contribution that a feature has in determining the outcome (prediction) of the model. SHAP values are estimated using the testing set for each permutation, allowing for the assessment of the distribution of feature attribution over the four training–testing regions.
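The underlying Shapley value can be computed exactly for a small model by enumerating every coalition of features. The sketch below does this by brute force and is not TreeSHAP, which reaches the same kind of attribution for tree ensembles far more efficiently. The toy model, the zero baseline, and the helper names are hypothetical.

```python
from itertools import combinations
from math import factorial

def shapley_values(model, x, baseline):
    """Exact Shapley values for one observation by enumerating all feature coalitions.

    Features outside a coalition are set to their baseline value.
    """
    n = len(x)

    def value(coalition):
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi

def mean_abs_shap(model, X, baseline):
    """Global importance: mean absolute Shapley value of each feature over the rows of X."""
    per_row = [shapley_values(model, row, baseline) for row in X]
    return [sum(abs(r[i]) for r in per_row) / len(per_row) for i in range(len(baseline))]

# Hypothetical model: one additive term plus an interaction between features 1 and 2.
toy_model = lambda v: 2 * v[0] + v[1] * v[2]
toy_phi = shapley_values(toy_model, [1, 2, 3], baseline=[0, 0, 0])
toy_importance = mean_abs_shap(toy_model, [[1, 2, 3], [-1, 2, 3]], baseline=[0, 0, 0])
```

The attributions for one observation sum to the difference between its prediction and the baseline prediction, and the interaction term is shared equally between the two features involved.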

3.2.7 Landslide susceptibility mapping

A landslide susceptibility map was produced for the entire study area to visualize the spatial patterns of submarine landslide susceptibility. The final map was created by training each model with the input data from all four training–testing regions and then predicting the probability of the landslide class from 0 to 1 for each pixel over the study area. Since the landslide observations from the dataset by Dyer et al. (2022) represent a variety of types, including slumps and flows, the predictions represent the potential for the mass transport of sediment that is inclusive of all types of transport mechanisms. Additionally, because the landslide boundaries represent the depletion areas, the landslide susceptibility maps will be applicable to predicting potential areas for landslide initiation and source sediment. The Jenks Natural Breaks Classification method was used to classify the landslide susceptibility prediction into risk classes representing very low, low, medium, high, and very high (Jenks 1967), a method that has been utilized to visually represent the degree of landslide risk (Chen et al. 2017; Song et al. 2018; Shahri et al. 2019; Wang et al. 2020). A cumulative distribution function (CDF) plot was additionally supplied for each map to provide context for the distribution of values within each risk class.
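The classification step can be illustrated with a small exact implementation of natural breaks as a one-dimensional partition that minimizes within-class squared deviation. This is a sketch under stated assumptions: it is not the GIS software routine, it is written for small inputs, and the probabilities below are made up.

```python
from bisect import bisect_left

LABELS = ["very low", "low", "medium", "high", "very high"]

def jenks_breaks(values, n_classes):
    """Upper bound of each class from an exact 1-D partition minimising within-class squared deviation."""
    data = sorted(values)
    n = len(data)
    # Prefix sums give the squared deviation of any slice data[i:j] in O(1).
    s1, s2 = [0.0], [0.0]
    for v in data:
        s1.append(s1[-1] + v)
        s2.append(s2[-1] + v * v)

    def ssd(i, j):  # squared deviation of data[i:j], with j > i
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # cost[k][j]: best cost of splitting data[:j] into k classes
    cost = [[inf] * (n + 1) for _ in range(n_classes + 1)]
    split = [[0] * (n + 1) for _ in range(n_classes + 1)]
    cost[0][0] = 0.0
    for k in range(1, n_classes + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):
                c = cost[k - 1][i] + ssd(i, j)
                if c < cost[k][j]:
                    cost[k][j], split[k][j] = c, i
    # Walk back through the split table to recover the class upper bounds.
    bounds, j = [], n
    for k in range(n_classes, 0, -1):
        bounds.append(data[j - 1])
        j = split[k][j]
    return bounds[::-1]

def classify(p, bounds):
    """Risk label of a probability given ascending class upper bounds."""
    return LABELS[min(bisect_left(bounds, p), len(bounds) - 1)]

# Made-up pixel probabilities with five visible clusters.
toy_probs = [0.01, 0.02, 0.20, 0.22, 0.45, 0.47, 0.70, 0.72, 0.95, 0.97]
toy_bounds = jenks_breaks(toy_probs, 5)
# Empirical CDF evaluated at each class upper bound.
toy_cdf = [sum(v <= b for v in toy_probs) / len(toy_probs) for b in toy_bounds]
```

Reading the empirical CDF at each class bound gives the share of pixels at or below that class, which is the context the CDF plots are meant to provide for the mapped risk classes.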

4 Results

4.1 Mutual information

The MI results, which represent the ability of a feature to discern between the landslide and non-landslide groups and therefore its strength as a predictor of landslide occurrence, are shown in Table 1. Sediment types of mud, sand, gravel, and rock showed the highest dependency between the two groups. Among the topographical features, elevation, curvature, and aspect had the lowest MI scores, while slope showed a higher amount of dependency. Features with high MI scores included the geomorphological factors: basins, canyons, channels, and escarpments. All the geochemical features received relatively low MI scores.

Table 1 Mutual information results for each feature

4.2 Collinearity

The highest correlation, at −0.8, occurred between two sediment types: mud and gravel. Of the two features, mud had the higher MI value, and therefore gravel was dropped from further analysis. Other considerably high collinearities occurred between mud volcanoes and canyons with a correlation of 0.79, as well as hydrates and canyons with a correlation of 0.71. However, these values did not meet or exceed ±0.8, and both pairs of features were therefore retained for analysis. The full Pearson correlation matrix can be found in Fig. S4 (Online Resource 1).

4.3 Model evaluation

The ability of the LR and GBDT models to classify landslide presence given the remaining 19 input features was assessed using the permutation method that performs on all combinations of three training–testing set ratios (1:1, 2:1, and 3:1). The 1:1 ratio had 12 combinations with the number of training observations ranging from 14,161 to 70,301. The 2:1 ratio also had 12 combinations with the number of training observations ranging from 35,688 to 104,350. The 3:1 ratio had 4 possible combinations and the number of training observations ranged from 69,737 to 125,877. The averaged results for all evaluation metrics over each training–testing ratio are reported in Table S5 (Online Resource 1).

Figure 3 compares the evaluation metric scores of the LR and GBDT models on the testing sets from the permutation method, showing how performance differs between the two models and with the number of training regions. For both models, the median accuracy score decreases as the training set size increases, whereas the median precision, recall, and AUC scores increase. Precision scores were low (< 0.5), indicating that the models tend to predict an observation as positive (landslide) when it is negative (non-landslide). Recall scores had high variability but did increase on average as more regions were used for training. Both the LR and GBDT models seldom have AUC scores under 0.50, indicating that the models generally perform better than random guessing. The GBDT model in the 3:1 permutation achieved the highest AUC score compared to all other permutations with an average score of 0.81. Overall, the GBDT model outperformed LR, with median AUC values 29.6%, 14.0%, and 7.2% higher for the GBDT model in the 1:1, 2:1, and 3:1 training–testing ratio groups, respectively.

The ROC curves (Fig. 4) illustrate that the models generally performed better with the addition of training data from various regions in the study area. Model overfitting, measured here as the difference between the training and testing set metrics, decreased from the 1:1 to the 3:1 training–testing ratio group as the average AUC increased. Overfitting was especially minimal for the GBDT model. These findings indicate that model performance is unstable when training on a single region of the study area, with variable metrics on unseen data.

4.4 Landslide susceptibility predictors

Figure 5 shows the distribution of mean SHAP values over each permutation within the 1:1, 2:1, and 3:1 training–testing ratio groups. Here, feature attribution is estimated using TreeSHAP (Lundberg et al. 2018) with the mean absolute SHAP value representing the contribution that a feature has in determining the model outcome.

The ranking of top predictor features varied in each training–testing permutation group, but among the top 5 predictors for each group were features related to topography, geomorphology, gas migration, and sediment characteristics. Only 7 features had a median SHAP value greater than zero in the 3:1 permutation group, including slope, seeps, sediment type—rock, pockmarks, gas, faults, and escarpments. This coincides with a smaller distribution of AUC scores for the 3:1 group, suggesting that the GBDT model can perform more consistently with additional training regions and a minimal number of features.

Slope had the highest median contribution compared to all the other features for the three training–testing groups. Based on the MI results, it would be expected that the topographic features, aspect, curvature, and elevation, have minimal influence on landslide classification, and this holds true in the SHAP results with the exception of curvature and elevation obtaining considerably high SHAP values in the 1:1 and 2:1 ratio groups.

The importance of each geomorphological feature fluctuated depending on the training region(s) because the type of seafloor geomorphologies within each region varies. All the geomorphology factors, basins, canyons, channels, and escarpments, have a high influence on landslide classification in the 1:1 and 2:1 ratio groups, but only the distance to escarpments feature shows a high level of importance in the 3:1 ratio group.

Of the three features used in modelling that relate to the percent of sediment type classified as mud, rock, and sand, only rock had a major influence on landslide classification. The percentage of mud and sand coverage on the seafloor had a minimal contribution to the GBDT model based on SHAP. Other geological features, namely faults and sediment thickness, had a positive influence on model performance, with the exception of salt diapirs, which had very low SHAP values over all three permutation groups.

Among the geochemical features, gas, mud volcanoes, and seeps provided positive contributions to the GBDT model predictions. These features can indicate the presence of gas migration, which can be a trigger for slope failure initiation. It can be expected that lower distances to these seafloor geohazards will increase the probability of a submarine landslide occurrence. However, distance to pockmarks and hydrates had low importance scores over each permutation. The low importance of the pockmarks feature may indicate that gas migration is a stronger predictor for landslide susceptibility than fluid migration. Data collected for hydrates (Twichell et al. 1996; Majumdar et al. 2017) has a minimal coverage over the study region and is shown in Fig. S1 (Online Resource 1), so the information available to the model on hydrate location relative to landslide observations may not allow for a strong relationship to be identified.

4.5 Landslide susceptibility mapping

Figure 6 shows landslide susceptibility maps for the full study area predicted by the LR and GBDT models, using the Jenks Natural Breaks Classification method to classify landslide susceptibility risk into five classes of very low, low, medium, high, and very high (Jenks 1967).
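As an illustration of the classification step, the sketch below implements a small exact natural-breaks routine that chooses class boundaries minimizing the within-class sum of squared deviations, then labels probabilities with the five susceptibility classes. It is a pure-Python stand-in written for this example, not the GIS tool used in the study, and the probability values are hypothetical.

```python
from bisect import bisect_left

LABELS = ["very low", "low", "medium", "high", "very high"]


def jenks_breaks(values, n_classes):
    """Upper bound of each class, minimizing within-class squared deviation.

    Exact dynamic programme, O(n_classes * n^2); suitable for small samples.
    """
    data = sorted(values)
    n = len(data)
    if not 1 <= n_classes <= n:
        raise ValueError("n_classes must be between 1 and len(values)")

    # Prefix sums give the squared deviation of data[i:j] in constant time.
    s1, s2 = [0.0], [0.0]
    for v in data:
        s1.append(s1[-1] + v)
        s2.append(s2[-1] + v * v)

    def ssd(i, j):
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # cost[k][j]: best cost of splitting data[:j] into k classes
    cost = [[inf] * (n + 1) for _ in range(n_classes + 1)]
    split = [[0] * (n + 1) for _ in range(n_classes + 1)]
    cost[0][0] = 0.0
    for k in range(1, n_classes + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):
                c = cost[k - 1][i] + ssd(i, j)
                if c < cost[k][j]:
                    cost[k][j] = c
                    split[k][j] = i

    # Walk back through the split table to recover the class upper bounds.
    bounds = []
    j = n
    for k in range(n_classes, 0, -1):
        bounds.append(data[j - 1])
        j = split[k][j]
    return bounds[::-1]


def classify(p, bounds):
    """Map a probability to a class label; class upper bounds are inclusive."""
    return LABELS[min(bisect_left(bounds, p), len(bounds) - 1)]


# Hypothetical predicted landslide probabilities for a handful of grid cells
probs = [0.02, 0.03, 0.05, 0.20, 0.22, 0.45, 0.50, 0.70, 0.72, 0.95, 0.97]
bounds = jenks_breaks(probs, 5)
print(bounds)
print([classify(p, bounds) for p in (0.04, 0.60, 0.99)])
```

Because the breaks follow the data rather than fixed intervals, the same probability can fall into different classes in the LR and GBDT maps, which is worth keeping in mind when comparing the two.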

Similar landslide susceptibility patterns are observed between the LR and GBDT models, but slight differences can be identified. The LR susceptibility map only classified the northern portion of the Florida escarpment as having a very high landslide potential, whereas the GBDT susceptibility map depicts the entire escarpment area as being at a very high risk. Based on the CDF plots, the LR model predicted a slightly higher percentage of the area classified in the high and very high-risk bins compared to the GBDT model. Furthermore, based on a visual assessment, the GBDT model was more capable of distinguishing between high and low landslide risk classes at smaller scales.

The CDFs provided in Fig. 6 show that more than 80% of the study area falls into the low or very low landslide risk classes, which is visible in the Mississippi Canyon, Mississippi Fan, and Florida Shelf. We expect high sediment movement in Mississippi Canyon, but sediment migration does not qualify under the submarine landslide definition outlined for this study. Other minimally susceptible areas tend to occur between salt basins on the TX-LA slope. In contrast, a small percentage of the total area (~15%) was predicted to have a high or very high landslide susceptibility. Based on the visualization of landslide potential in Fig. 6, most of the high and very high landslide risk classes occur on the TX-LA slope, Sigsbee Escarpment, and Florida Escarpment at the locations where the slope is steepest (Fig. S5, Online Resource 1).

It should be acknowledged that the environments of the Florida Escarpment and particularly the Florida Shelf are geologically unique within the study area, due to the carbonate platform that composes the continental shelf in the eastern GoM. The continental shelf along the northwestern and north-central GoM is characterized by heavy clastic sedimentation. Further offshore in deepwater regions, salt withdrawal mini-basins that result from salt diapirism are consequently filled with clastic sedimentary deposits (Galloway 2008). A key process differentiating the morphology of the Florida Escarpment from other regions in the GoM is erosion via dissolution and cliff collapse, whereas in areas of clastic sedimentation, erosion is generally instigated solely through mechanical processes rather than a combination of chemical and mechanical processes. Because the training–testing regions occur mainly within the northern GoM salt mini-basin region and the DeSoto Canyon, the LR and GBDT models are extrapolating to the unseen Florida region in the landslide susceptibility maps. This may introduce inaccurate results, because a limitation of these types of ML models is their inability to extrapolate reliably to conditions not represented in the training data.
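One narrow, numerical facet of this extrapolation limitation can be shown with a toy example. The sketch below fits a small gradient-boosted ensemble of one-split regression stumps (a simplified stand-in for a GBDT, not the study's implementation) to a hypothetical linear trend, then queries it outside the training range. All data and parameter values are assumptions chosen for illustration.

```python
def fit_stump(xs, residuals):
    """Best single-split regression stump by squared error.

    Returns (threshold, left_mean, right_mean).
    """
    best = None
    order = sorted(set(xs))
    for lo, hi in zip(order, order[1:]):
        t = (lo + hi) / 2.0
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lm = sum(left) / len(left)
        rm = sum(right) / len(right)
        err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, t, lm, rm)
    return best[1:]


def fit_boosted_stumps(xs, ys, n_rounds=200, learning_rate=0.1):
    """Least-squares gradient boosting with stumps; returns a predict function."""
    base = sum(ys) / len(ys)
    preds = [base] * len(xs)
    stumps = []
    for _ in range(n_rounds):
        # For squared loss, the negative gradient is the residual.
        residuals = [y - p for y, p in zip(ys, preds)]
        t, lm, rm = fit_stump(xs, residuals)
        stumps.append((t, lm, rm))
        preds = [p + learning_rate * (lm if x <= t else rm) for x, p in zip(xs, preds)]

    def predict(x):
        return base + sum(learning_rate * (lm if x <= t else rm) for t, lm, rm in stumps)

    return predict


# Hypothetical training data: a linear trend observed only for x in [0, 10]
xs = [float(i) for i in range(11)]
ys = list(xs)
model = fit_boosted_stumps(xs, ys)

for query in (5.0, 10.0, 20.0):
    print(query, round(model(query), 3))
```

Every split threshold lies inside the training range, so a query at 20 lands in the same leaves as a query at 10 and receives the same prediction. Real extrapolation in susceptibility mapping involves unfamiliar combinations of many features rather than a single out-of-range value, but the piecewise-constant behaviour is the same.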

5 Discussion

Given the availability of high-resolution bathymetry and geohazard datasets, a spatial database of topographical, geomorphological, geological, and geochemical seafloor characteristics and historic landslide areas (Dyer et al. 2022) was created that integrates many known triggers and conducive conditions of submarine landslides. These data make the northern GoM a viable region for an adapted offshore LSM application; thus, a case study was conducted to evaluate the potential of submarine landslide events in the deepwater northern GoM. Favorable results from the ML approach can be attributed to the comprehensive literature review that identified the appropriate submarine landslide factors for the study region, allowing for accurate LSM to be performed. Further context is provided into successes as well as limiting factors worth further discussion.

An LR model was utilized as a baseline for the predictive models, and, as expected, the GBDT model outperformed the LR model with higher median AUC scores for each of the three training–testing ratio groups. This enhanced landslide classification may be due to the ability of GBDTs to model non-linear interactions between input features and reduce overfitting (Friedman 2002; Elith et al. 2008). While the GBDT model showed consistent AUC scores when classifying between historic landslide scars and undisturbed areas over each permutation of training–testing ratio groups, the degree of overfitting varied and was smallest when training on three different regions in the study area. This can be expected when performing LSM with supervised ML, as ML models are complex algorithms, and changing model parameters and/or training data can lead to varying results (Goetz et al. 2015). These results indicate that LSM model predictions will be more stable and accurate with training data from a variety of environmental conditions.
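The two quantities compared in this paragraph, AUC and the degree of overfitting, can be sketched as follows. The AUC function uses the rank interpretation (the probability that a randomly chosen landslide cell scores higher than a randomly chosen undisturbed cell), and overfitting is summarized here as the gap between training and testing AUC, which is one common convention and an assumption of this example. The AUC pairs are hypothetical and are not values from the study.

```python
from statistics import median


def roc_auc(labels, scores):
    """AUC as P(score of a positive > score of a negative); ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical (train AUC, test AUC) pairs per permutation, grouped by
# training-testing ratio. These are NOT results from the study.
runs = {
    "1:1": [(0.99, 0.78), (0.98, 0.85), (0.99, 0.70)],
    "3:1": [(0.93, 0.86), (0.94, 0.88), (0.92, 0.87)],
}

summary = {
    group: {
        "median_test_auc": median(test for _, test in pairs),
        "median_gap": median(train - test for train, test in pairs),
    }
    for group, pairs in runs.items()
}

for group, stats in summary.items():
    print(group, stats)
```

A smaller median gap alongside a similar or higher median test AUC is the pattern that would support the statement that predictions become more stable as more training regions are included.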

The overall performance results were satisfactory when compared to other LSM studies using boosting algorithms; however, some studies have achieved higher AUC scores at 0.93 (Micheletti et al. 2014) and 0.98 (Song et al. 2018). Precision and recall were relatively low for our models, indicating a notable number of false positives and false negatives, respectively, in model predictions. However, since LSM models should minimize missed landslides (i.e., false negatives), model sensitivity (i.e., recall) should be prioritized. This can be achieved by decreasing the cut-off threshold for the probability of a positive landslide prediction, at the cost of more false positives. Thus, while overall performance may not be exceptional, a suitable cut-off threshold could provide an LSM model that is optimized to correctly identify locations with a high landslide susceptibility. Furthermore, the performance of the models in this case study is likely limited by the data, as ML model performance is limited by the quality and quantity of data supplied (Fabbri et al. 2003; Raja et al. 2017). This study was made possible by the emergence of submarine geophysical survey technology capable of accurately mapping seafloor geohazards that are related to landslide initiation. While it is assumed in this study that all geohazards are correctly mapped in the geohazards dataset (Bureau of Ocean & Energy Management 2016a), it is possible that the geohazards in the GoM are not fully accounted for. Therefore, the results of this analysis are reliant on the accuracy and precision of these geohazard mapping technologies and the resolution of data to support them.
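The effect of the cut-off threshold on recall can be illustrated with a short sketch. The labels and predicted probabilities below are hypothetical, and the two thresholds are arbitrary choices for illustration rather than values recommended by the study.

```python
def precision_recall(labels, probs, threshold):
    """Precision and recall when cells with prob >= threshold are flagged."""
    tp = sum(1 for y, p in zip(labels, probs) if y == 1 and p >= threshold)
    fp = sum(1 for y, p in zip(labels, probs) if y == 0 and p >= threshold)
    fn = sum(1 for y, p in zip(labels, probs) if y == 1 and p < threshold)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical test cells: 1 = mapped landslide, 0 = undisturbed seafloor
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
probs = [0.9, 0.6, 0.4, 0.3, 0.7, 0.45, 0.35, 0.1, 0.05, 0.02]

for threshold in (0.5, 0.3):
    precision, recall = precision_recall(labels, probs, threshold)
    print(f"threshold={threshold}: precision={precision:.2f}, recall={recall:.2f}")
```

In this toy case, lowering the threshold recovers every landslide cell but flags more undisturbed cells, so the choice of threshold depends on how costly a missed landslide is relative to a false alarm for the planning decision at hand.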

The 19 input features were shown to have varying levels of importance in modelling landslide potential, depending on the regions used in each training–testing permutation. The influence that each feature has on modelling landslide potential is location-specific, as the spatial distribution of certain geohazards varies. It is our understanding that the entire study region was mapped for geological formations (e.g., pockmarks and basins), so locations with minimal geological hazards would be considered to have a low landslide potential in the absence of high-risk topographic features. Slope is a major influencing feature, with most steep slopes having a high or very high landslide risk classification. Results by McAdoo et al. (2000) suggest that in the MRDF region of the GoM, slide events are more prone to occur on shallower, unconsolidated slopes; however, the landslide susceptibility maps produced for this study region, which does not overlap with the MRDF region (Fig. 6), illustrate that areas with a higher slope are more susceptible to landslides. Other notable features with a large influence on landslide susceptibility in the study region include features related to subsurface gas migration, sediment type or lithology, and high-sloped geomorphologies. McAdoo et al. (2000) also identified major slope failures in the deepwater GoM occurring at high-sloped geomorphologies between salt withdrawal basins as well as along the Sigsbee Escarpment, a steep and significant bathymetric feature formed at the southern termination of the Louann salt sheet, which confirms the importance of the basins and escarpment features in permutation subsets where those seafloor geomorphologies are present. Furthermore, examining the presence of rock versus the other sediment types in these analyses, sand and mud, provides additional insights. The usSEABED dataset (Schweitzer et al. 2020) reports that the rock classification conveys both loose rock, which is coarser than cobble (-8 phi), and bedrock. If the majority of the substrate contains loose rock, based on rheologic principles it would be expected to have a lower shear strength, while a higher shear strength would be expected for bedrock. This combined classification of loose rock and bedrock therefore conveys a large range of rheologic properties to the models, which may account for the significant interquartile range shown in Fig. 6. According to the frequency analysis (Table S2, Online Resource 1), 78% of the study area has a low percentage (0–25%) of rock in the substrate and only 14% has a high percentage (75–100%). An examination of core data (Lamont-Doherty Core Repository 1977) from the study regions where usSEABED (Buczkowski et al. 2020) indicates a loose rock-bedrock composition suggests that at least the approximately upper 1 m of substrate is dominated by lutite, or fine-grained clays and muds to claystone and mudstone. Based on these results, it is plausible that sediment overlying the areas classified as loose rock-bedrock may be more prone to slope instability because these overlying fine-grained sediments are less competent than bedrock.

The models were successful, showing that it is possible to use ML to accurately forecast submarine landslide susceptibility. However, there are limitations that need to be addressed to improve model performance when applied to landslide susceptibility mapping. This study concluded from the feature attribution results that slope is one of the most influential features for delineating landslides in the study area. A map comparison of the slope feature and GBDT landslide susceptibility prediction over the study region in Fig. S5 (Online Resource 1) illustrates that the very high landslide susceptibility class correlates spatially with steep sloped areas. This agrees with McAdoo et al. (2000), who found that the slope grade of the resulting landslide scarp (the steep section of undisturbed material at the upper edge of the landslide area, left behind by the movement of displaced material) is anomalous relative to the adjacent area. Therefore, it is expected that the slope feature is strongly correlated to landslide observations. This makes the slope feature useful for submarine landslide identification; however, it may be misleading for assessing susceptibility. Additional environmental information is necessary to distinguish the landslide potential for areas with similar slope values, and an ML model is needed that can identify these multi-related interactions on the seafloor. Furthermore, there is a lack of knowledge regarding the timing and frequency of submarine landslides due to a lack of in situ measurements of these events and uncertainties in dating methods such as radiocarbon dating, oxygen isotope curves, and tephrochronology (Huhn et al. 2019). Without an understanding of the temporal characteristics of the submarine landslide occurrences and temporal variation in their triggers for this study area, the landslide susceptibility maps can be used to portray where submarine landslides are likely to occur spatially but cannot be used to conclude when these events will occur (Chacón et al. 2006; Reichenbach et al. 2018). Therefore, we acknowledge this temporal uncertainty and variation to be a limitation of this LSM application, in that each susceptibility map produced with this method provides predictions based on conditions at the time the data were collected and does not reflect any subsequent changes to the bathymetry or other conditions.

6 Conclusion

Submarine landslides pose a significant threat to current and future offshore infrastructure; however, LSM has seldom been applied to offshore regions to spatially forecast the potential of submarine landslides in economically prudent offshore areas. With the availability of region-wide data that are related to submarine landslide occurrences, the GoM was used as a case study for applying LSM techniques to a basin-scale region in the offshore environment. While many methods for producing landslide susceptibility maps exist, this study employed a GBDT model using a permutation approach to assess how the location and amount of training data influence model accuracy. Variability in model accuracy indicated that the GBDT model provides a more accurate model overall compared to LR and that the LSM model performance improves when training on several locations that are geologically unique.

The importance that each feature has in forecasting landslide potential is location-specific; however, general observations can be made about the features necessary for LSM in the GoM case study region. Based on feature importance metrics, geomorphological settings, including basins, canyons, and escarpments, were shown to provide valuable information to forecast landslide susceptibility. Other notable influencing factors include the percentage of rock coverage, faults, and various types of gas presence. Factors that were not considered in this study include wave height and sediment accumulation rate, which may be important in shallow environment LSM (less than approximately 120 m of water depth where extreme waves are still able to exert pressure variations at the bottom).

The results of this case study offer an initial understanding of how available offshore spatial datasets can be utilized to develop landslide susceptibility maps. By means of characterizing spatial patterns of high and low landslide potential areas, LSM can aid in the current and future planning of offshore infrastructure to help prevent and mitigate potential incidents. Future LSM studies may be influenced by climate change effects such as sea level rise and sedimentation rate as these factors become more conclusive (Urlaub et al. 2013). While this study reports the success of LSM in the northern GoM at a large spatial scale using a GBDT model, the results should be utilized as a baseline for future model improvements and extrapolated to other offshore regions.

References

Anderson, RS, Anderson, SP (2010) Geomorphology: the mechanics and chemistry of landscapes: Cambridge University Press
Brunsden, D, Prior, DB (1984) Submarine slope instability. In: Brunsden, D, Prior, DB (Ed.), Slope Instability. Baffins Lane, Chichester, Sussex England. Wiley (John) & Sons, Limited.
Buczkowski, B, Reid, J, Schweitzer, B et al (2020) usSEABED: Offshore Surficial-Sediment Database for Samples Collected within the United States Exclusive Economic Zone. Accessed 15 June 2011
Bureau of Ocean & Energy Management (2008) The offshore petroleum industry in the Gulf of Mexico. https://www.boem.gov/sites/default/files/boem-education/BOEM-Education-Images-and-Resources/TheOffshorePetroleumIndustryOrganizationalScheme.pdf. Accessed 6 July 2022
Bureau of Ocean & Energy Management (2016a) BOEM Seismic Water Bottom Anomalies - Gulf of Mexico - Gulf of Mexico NAD27. Bureau of Ocean Energy Management. New Orleans, LA. https://www.boem.gov/Seismic-Water-Bottom-Anomalies-Map-Gallery/. Accessed 8 August 2019
Bureau of Ocean & Energy Management (2016b) Gulf of Mexico Deepwater Bathymetry with Hillshade. Bureau of Ocean Energy Management. New Orleans, LA. https://www.boem.gov/Gulf-of-Mexico-Deepwater-Bathymetry. Accessed 30 May 2017
Casey, JP (2019) Danger from the deep: underwater mudslides in the Gulf of Mexico. Offshore Technology. https://www.offshore-technology.com/analysis/danger-from-the-deep-underwater-mudslides-in-the-gulf-of-mexico/. Accessed 7 July 2022
Chacón J, Irigaray C, Fernandez T et al (2006) Engineering geology maps: landslides and geographical information systems. B Eng Geol Environ 65(4):341–411. https://doi.org/10.1007/s10064-006-0064-z
Chen, T, Guestrin, C (2016) XGBoost: A scalable tree boosting system. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. https://doi.org/10.1145/2939672.2939785
Chen W, Pourghasemi HR, Panahi M et al (2017) Spatial prediction of landslide susceptibility using an adaptive neuro-fuzzy inference system combined with frequency ratio, generalized additive model, and support vector machine techniques. Geomorphol 297:69–85. https://doi.org/10.1016/j.geomorph.2017.09.007
Coleman, JM, Prior, DB, Garrison, LE (1978) Submarine landslides in the Mississippi River delta. In: Proceedings of the Offshore Technology Conference. Houston, Texas, USA. https://doi.org/10.4043/3170-MS
Coleman, JM, Prior, DB, Garrison, LE (1980) Subaqueous sediment instabilities in the offshore Mississippi River delta: Bureau of Land Management, New Orleans OCS Office
Collico S, Arroyo M, Urgeles R et al (2020) Probabilistic mapping of earthquake-induced submarine landslide susceptibility in the South-West Iberian margin. Mar Geol 429:106296. https://doi.org/10.1016/j.margeo.2020.106296
Cooper AK, Hart PE (2002) High-resolution seismic-reflection investigation of the northern Gulf of Mexico gas-hydrate-stability zone. Mar Pet Geol 19(10):1275–1293. https://doi.org/10.1016/S0264-8172(02)00107-1
Cover T, Hart P (1967) Nearest neighbor pattern classification. IEEE Trans Inf Theory 13(1):21–27. https://doi.org/10.1109/TIT.1967.1053964
Diegel, FA, Karlo, J, Schuster, D et al (1995) Cenozoic structural evolution and tectono-stratigraphic framework of the northern Gulf Coast continental margin. In: M. Jackson, D. Roberts, S. Snelson (Eds.), Salt Tectonics: A Global Perspective. AAPG Memoir 65, pp. 109–151. https://doi.org/10.1306/M65604C6
Dyer, AS, Pantaleone, S, Mark-Moser, M et al (2022) Historic submarine landslides in the northern Gulf of Mexico. Energy Data eXchange. https://doi.org/10.18141/1879673. Accessed 6 June 2022
Elith J, Leathwick JR, Hastie T (2008) A working guide to boosted regression trees. J Anim Ecol 77(4):802–813. https://doi.org/10.1111/j.1365-2656.2008.01390.x
Environmental Systems Research Institute (2022) ArcGIS Pro Desktop (Version 2.9.1). Redlands, CA
Fabbri AG, Chung C-JF, Cendrero A et al (2003) Is prediction of future landslides possible with a GIS? Nat Hazards 30(3):487–503. https://doi.org/10.1023/B:NHAZ.0000007282.62071.75
Feseker, T, Boetius, A, Wenzhöfer, F et al (2014) Eruption of a deep-sea mud volcano triggers rapid sediment movement. Nat Commun 5(5385). https://doi.org/10.1038/ncomms6385
Friedman JH (2002) Stochastic gradient boosting. Comput Stat Data Anal 38(4):367–378. https://doi.org/10.1016/S0167-9473(01)00065-2
Galloway WE (2008) Depositional evolution of the Gulf of Mexico sedimentary basin. Sediment Basins World 5:505–549. https://doi.org/10.1016/S1874-5997(08)00015-4
Goetz J, Brenning A, Petschko H et al (2015) Evaluating machine learning and statistical prediction techniques for landslide susceptibility modeling. Comput Geosci 81:1–11. https://doi.org/10.1016/j.cageo.2015.04.007
Harris P, Macmillan-Lawler M, Rupp J et al (2014) Geomorphology of the oceans. Mar Geol 352:4–24. https://doi.org/10.1016/j.margeo.2014.01.011
Henkel D (1970) The role of waves in causing submarine landslides. Geotechnique 20(1):75–80
Hitchcock, C, Givler, R, Angell, M et al (2010) GIS-based assessment of submarine mudflow hazard offshore of the Mississippi Delta, Gulf of Mexico. In: D. Mosher, R. Shipp, L. Moscardelli, J. Chaytor et al (Eds.), Submarine Mass Movements and Their Consequences. Advances in Natural and Technological Hazards Research, Vol. 28. Springer, Dordrecht, pp. 353–364. https://doi.org/10.1007/978-90-481-3071-9_29
Huhn, K, Arroyo, M, Cattaneo, A et al (2019) Modern submarine landslide complexes: a short review. In: K. Ogata, A. Festa, G. Pini (Eds.), Submarine landslides, pp. 181–200. https://doi.org/10.1002/9781119500513.ch12
IOC, IHO, BODC (2003) Centenary Edition of the GEBCO Digital Atlas. Published on CD-ROM on behalf of the Intergovernmental Oceanographic Commission and the International Hydrographic Organization as part of the General Bathymetric Chart of the Oceans. British Oceanographic Data Centre, Liverpool. https://www.gebco.net/. Accessed 6 May 2022
Jenks GF (1967) The data model concept in statistical mapping. International Yearbook of Cartography 7:186–190
Kaiser MJ, Yu Y, Jablonowski CJ (2009) Modeling lost production from destroyed platforms in the 2004–2005 Gulf of Mexico hurricane seasons. Energy 34(9):1156–1171. https://doi.org/10.1016/j.energy.2009.04.032
Kraskov A, Stögbauer H, Grassberger P (2004) Estimating mutual information. Phys Rev E 69(6):066138. https://doi.org/10.1103/PhysRevE.69.066138
Lamont-Doherty Core Repository (1977) Archive of Geosample Data and Information from the Columbia University Lamont-Doherty Earth Observatory (LDEO) Lamont-Doherty Core Repository (LDCR). NOAA National Centers for Environmental Information. https://doi.org/10.7289/V5M61H7G. Accessed 3 August 2022
Lundberg, SM, Erion, GG, Lee, S-I (2018) Consistent individualized feature attribution for tree ensembles. arXiv preprint arXiv:1802.03888. https://doi.org/10.48550/arXiv.1802.03888
Lundberg, SM, Lee, S-I (2017) A unified approach to interpreting model predictions. In: Proceedings of the Advances in neural information processing systems.
Majumdar U, Cook AE, Scharenberg M et al (2017) Semi-quantitative gas hydrate assessment from petroleum industry well logs in the northern Gulf of Mexico. Mar Pet Geol 85:233–241. https://doi.org/10.1016/j.marpetgeo.2017.05.009
Maloney JM, Bentley SJ, Xu K et al (2020) Mass wasting on the Mississippi River subaqueous delta. Earth-Sci Rev 200:103001. https://doi.org/10.1016/j.earscirev.2019.103001
Martin RG, Bouma AH (1982) Active diapirism and slope steepening, northern Gulf of Mexico continental slope. Mar Georesour Geotechnol 5(1):63–91. https://doi.org/10.1080/10641198209379837
Masson D, Harbitz C, Wynn R et al (2006) Submarine landslides: processes, triggers and hazard prediction. Philos Transact R Soc a: Math Phys Eng Sci 364(1845):2009–2039. https://doi.org/10.1098/rsta.2006.1810
McAdoo B, Pratson L, Orange D (2000) Submarine landslide geomorphology, US continental slope. Mar Geol 169(1–2):103–136. https://doi.org/10.1016/S0025-3227(00)00050-5
Micheletti N, Foresti L, Robert S et al (2014) Machine learning feature selection methods for landslide susceptibility mapping. Math Geosci 46:33–57. https://doi.org/10.1007/s11004-013-9511-0
Milkov AV, Sassen R (2000) Thickness of the gas hydrate stability zone, Gulf of Mexico continental slope. Mar Pet Geol 17(9):981–991. https://doi.org/10.1016/S0264-8172(00)00051-9
Obelcz, J, Wood, WT, Phrampus, BJ et al (2020) Machine learning augmented time‐lapse bathymetric surveys: A case study from the Mississippi river delta front. Geophys Res Lett 47(10):e2020GL087857. https://doi.org/10.1029/2020GL087857
Offshore Energy (2018) White Paper: Using oil and gas infrastructure for energy transition. Retrieved from https://www.offshore-energy.biz/using-oil-and-gas-infrastructure-for-energy-transition/. Accessed 7 July 2022
Pampell-Manis A, Horrillo J, Shigihara Y et al (2016) Probabilistic assessment of landslide tsunami hazard for the northern Gulf of Mexico. J Geophys Res: Oceans 121(1):1009–1027. https://doi.org/10.1002/2015JC011261
Pedregosa F, Varoquaux G, Gramfort A et al (2011) Scikit-learn: Machine learning in Python. J Mach Learn Res 12:2825–2830. https://doi.org/10.5555/1953048.2078195
Raja NB, Çiçek I, Türkoğlu N et al (2017) Landslide susceptibility mapping of the Sera River Basin using logistic regression model. Nat Hazards 85(3):1323–1346. https://doi.org/10.1007/s11069-016-2591-7
Reichenbach P, Rossi M, Malamud BD et al (2018) A review of statistically-based landslide susceptibility models. Earth-Sci Rev 180:60–91. https://doi.org/10.1016/j.earscirev.2018.03.001
Restreppo GA, Wood WT, Phrampus BJ (2020) Oceanic sediment accumulation rates predicted via machine learning algorithm: towards sediment characterization on a global scale. Geo-Mar Lett 40(5):755–763. https://doi.org/10.1007/s00367-020-00669-1
Ross BC (2014) Mutual information between discrete and continuous data sets. PLoS ONE 9(2):e87357. https://doi.org/10.1371/journal.pone.0087357
Sahin EK (2020) Assessing the predictive capability of ensemble tree methods for landslide susceptibility mapping using XGBoost, gradient boosting machine, and random forest. SN Appl Sci 2(7):1–17. https://doi.org/10.1007/s42452-020-3060-1
Sassen R, Sweet S, Milkov A et al (1999) Geology and geochemistry of gas hydrates, central Gulf of Mexico continental slope. Gulf Coast Assoc Geol Soc Trans 49:462–469
Sawyer DE, Mason RA, Cook AE et al (2019) Submarine landslides induce massive waves in subsea brine pools. Sci Rep 9(1):1–9. https://doi.org/10.1038/s41598-018-36781-7
Schweitzer, B, Buczkowski, J, Reid, P et al (2020) usSEABED: Offshore surficial-sediment database for samples collected within the United States Exclusive Economic Zone. https://doi.org/10.5066/P9H3LGWM
Shahri AA, Spross J, Johansson F et al (2019) Landslide susceptibility hazard map in southwest Sweden using artificial neural network. CATENA 183:104225. https://doi.org/10.1016/j.catena.2019.104225
Shan, Z, Guo, F, Lai, X et al (2021) Assessment of submarine landslide susceptibility in the Sea Area of Zhoushan. In: Proceedings of the IOP Conference Series: Earth and Environmental Science. Suzhou, China.
Shanmugam G, Wang Y (2015) The landslide problem. J Palaeogeogr 4(2):109–166. https://doi.org/10.3724/SP.J.1261.2015.00071
Shapley, LS (1997) A value for n-person games. In: H. Kuhn, A. Tucker (Eds.), Contributions to the Theory of Games II. Princeton. Princeton University Press, pp. 307–317. https://doi.org/10.1515/9781400881970-018
Shepard FP (1955) Delta-front valleys bordering the Mississippi distributaries. Geol Soc Am Bull 66(12):1489–1498. https://doi.org/10.1130/0016-7606(1955)66[1489:DVBTMD]2.0.CO;2
Song Y, Niu R, Xu S et al (2018) Landslide susceptibility mapping based on weighted gradient boosting decision tree in Wanzhou section of the Three Gorges Reservoir Area (China). ISPRS Int J Geo-Inf 8(1):4. https://doi.org/10.3390/ijgi8010004
Triguero I, García-Gil D, Maillo J et al (2019) Transforming big data into smart data: An insight on the use of the k-nearest neighbors algorithm to obtain quality data. Wires Data Min Knowl Discov 9(2):e1289. https://doi.org/10.1002/widm.1289
Tripsanas EK, Bryant WR, Phaneuf BA (2004) Slope-instability processes caused by salt movements in a complex deep-water environment, Bryant Canyon area, northwest Gulf of Mexico. AAPG Bull 88(6):801–823. https://doi.org/10.1306/01260403106
Twichell, DC, Cross, VA, Paskevich, VF et al (1995) ATMX_SED.SHP - 1995 National assessment of oil and gas resources of the United States: Sediment thickness in kilometers and meters [vector digital data]. U.S. Geological Survey, Coastal and Marine Geology Program, Woods Hole Science Center, Woods Hole, MA. http://pubs.usgs.gov/of/2005/1071/data/assessments/atmx_sed/atmx_sed.zip. Accessed 19 June 2013
Twichell, DC, Cross, VA, Paskevich, VF et al (1996) Seafloor or Short Core Hydrate Locations in the Gulf of Mexico (HYDRATES.SHP). U.S. Geological Survey, Coastal and Marine Geology Program, Woods Hole Science Center, Woods Hole, MA. https://doi.org/10.3133/ofr20051071. Accessed 19 June 2013
United States Geological Survey (2004a) Faults in the Gulf Coast [gcfaultsg]. U.S. Geological Survey data release. https://www.sciencebase.gov/catalog/item/60abc3f9d34ea221ce51e45f. Accessed 19 June 2013
United States Geological Survey (2004b) Salt Diapirs in the Gulf Coast [gcdiapirg]. U.S. Geological Survey data release. https://www.sciencebase.gov/catalog/item/60abc3e2d34ea221ce51e451. Accessed 19 June 2013
Urlaub M, Talling PJ, Masson DG (2013) Timing and frequency of large submarine landslides: implications for understanding triggers and future geohazard. Quat Sci Rev 72:63–82. https://doi.org/10.1016/j.quascirev.2013.04.020
Vanneste, M, Forsberg, CF, Glimsdal, S et al (2013) Submarine landslides and their consequences: what do we know, what can we do? In: C. Margottini, P. Canuti, K. Sassa (Eds.), Landslide science and practice. Berlin, Heidelberg. Springer, pp. 5–17. https://doi.org/10.1007/978-3-642-31427-8_1
Wang Y, Fang Z, Wang M et al (2020) Comparative study of landslide susceptibility mapping with different recurrent neural networks. Comput Geosci 138:104445. https://doi.org/10.1016/j.cageo.2020.104445

Acknowledgements

This work was funded with support through the National Energy Technology Laboratory’s Environmentally Prudent Stewardship Field Work Proposal 1025020, under the Research and Innovation Center. The authors would like to express their appreciation to Dakota Zaengle for his valuable contribution to checking the quality and accuracy of this scientific research. They would also like to thank Thomas Martin for his advice on machine learning.

Funding

This project was funded by the United States Department of Energy, National Energy Technology Laboratory, in part, through a site support contract. Neither the United States Government nor any agency thereof, nor any of their employees, nor the support contractor, nor any of their employees, makes any warranty, express or implied, or assumes any legal liability or responsibility for the accuracy, completeness, or usefulness of any information, apparatus, product, or process disclosed, or represents that its use would not infringe privately owned rights. Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise does not necessarily constitute or imply its endorsement, recommendation, or favoring by the United States Government or any agency thereof. The views and opinions of authors expressed herein do not necessarily state or reflect those of the United States Government or any agency thereof.

Author information

Authors and Affiliations

National Energy Technology Laboratory, 1450 Queen Avenue SW, Albany, OR, 97321, USA
Alec S. Dyer, MacKenzie Mark-Moser, Rodrigo Duran & Jennifer R. Bauer
NETL Support Contractor, 1450 Queen Avenue SW, Albany, OR, 97321, USA
Alec S. Dyer
Theiss Research, 7411 Eads Avenue, La Jolla, CA, 92037, USA
Rodrigo Duran

Authors

Alec S. Dyer
MacKenzie Mark-Moser
Rodrigo Duran
Jennifer R. Bauer

Contributions

All authors contributed to the conceptualization and methodology for the study. Data curation, formal analysis, investigation, and validation were performed by Alec Dyer. MacKenzie Mark-Moser provided insight into the geological aspects of the manuscript. Rodrigo Duran provided insight into the statistical and oceanographic influences relevant to the study. Jennifer Bauer, MacKenzie Mark-Moser, and Rodrigo Duran provided further support through supervision and project administration. The initial draft of the manuscript was written by Alec Dyer, and the other authors, MacKenzie Mark-Moser, Rodrigo Duran, and Jennifer Bauer, provided critical edits and feedback during multiple rounds of editing. All authors read and approved the final manuscript.

Corresponding author

Correspondence to MacKenzie Mark-Moser.

Ethics declarations

Conflict of interest

The authors have no relevant financial or non-financial interests to disclose.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (PDF 891 KB)

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Dyer, A.S., Mark-Moser, M., Duran, R. et al. Offshore application of landslide susceptibility mapping using gradient-boosted decision trees: a Gulf of Mexico case study. Nat Hazards 120, 6223–6244 (2024). https://doi.org/10.1007/s11069-024-06492-6

Received: 15 September 2022
Accepted: 04 February 2024
Published: 24 February 2024
Issue Date: May 2024
DOI: https://doi.org/10.1007/s11069-024-06492-6
