Abstract
Landslides pose a significant threat to China’s Three Gorges Reservoir area. Many ensemble learning models have been applied to landslide susceptibility mapping (LSM) in this region, as LSM forms the foundation of landslide risk management. However, most landslide susceptibility models lack interpretability, which hinders explanation of the relative importance of landslide conditioning factors and of the interactions among them. This study uses SHAP (SHapley Additive exPlanations) analysis to evaluate and interpret three tree-based ensemble learning models, namely XGBoost, Random Forest (RF), and LightGBM, for LSM in the Yichang section of the Three Gorges Reservoir area. XGBoost and RF achieve similar values of the area under the receiver operating characteristic curve (AUROC), 0.96 and 0.95 respectively, slightly outperforming LightGBM (AUROC of 0.93). Through individual interpretation of a dataset of 714 landslide samples, we identify four crucial landslide conditioning factors, shed light on the specific elements that drive higher susceptibility, and recommend suitable mitigation measures for different landslides. Global interpretation via SHAP reveals that elevation, Normalized Difference Vegetation Index (NDVI), distance from river, distance from road, slope, and lithology are the primary factors influencing landslide susceptibility. We examine the relationships among these factors, their values, and the mechanisms that trigger landslides. In addition, to enhance the credibility and reliability of the SHAP interpretation results, we cross-reference them with relevant literature on the formation mechanisms of landslides in the Three Gorges Reservoir area. This study contributes to a better understanding of landslide risk management and, by introducing SHAP, bridges the gap between advanced machine learning models and interpretable results.
Furthermore, we augment the SHAP analysis results with domain-specific landslide expertise, which helps offset the potential shortcomings of SHAP as a purely data-driven approach.
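To make the two quantities named in the abstract concrete, the following is a minimal, standard-library sketch of a rank-based AUROC and of exact Shapley values computed by brute-force enumeration. It is not the study's code: the model `toy_model`, its coefficients, the `background` samples, and the `labels` are all hypothetical, and in practice the SHAP library's tree-specific algorithms would replace the enumeration, which is only feasible for a handful of features.

```python
from itertools import combinations
from math import factorial


def auroc(labels, scores):
    """Rank-based AUROC: share of (positive, negative) pairs in which the
    positive sample scores higher; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def shapley_values(predict, x, background):
    """Exact Shapley values for one sample x.

    Features outside a coalition take the values of each background row in
    turn, and the resulting predictions are averaged.
    """
    n = len(x)

    def value(coalition):
        total = 0.0
        for row in background:
            mixed = [x[i] if i in coalition else row[i] for i in range(n)]
            total += predict(mixed)
        return total / len(background)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size: |S|! (n-|S|-1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


# Hypothetical toy "susceptibility" score, NOT the study's fitted model.
# Features: [elevation (km), NDVI, distance from river (km)].
def toy_model(features):
    elevation, ndvi, river = features
    near_river = 1.0 if river < 0.5 else 0.0
    return 0.9 - 0.3 * elevation - 0.4 * ndvi + 0.2 * near_river * (1.0 - ndvi)


# Made-up samples and labels, for illustration only.
background = [
    [0.2, 0.3, 0.1],
    [0.8, 0.6, 1.5],
    [1.2, 0.7, 2.0],
    [0.4, 0.2, 0.3],
]
labels = [1, 0, 0, 1]

scores = [toy_model(row) for row in background]
print("AUROC:", auroc(labels, scores))

# Individual (local) interpretation of a single sample.
x = background[0]
phi = shapley_values(toy_model, x, background)
base = sum(scores) / len(scores)
print("local contributions:", phi)
# Additivity: base value plus contributions reproduces the prediction.
print("base + sum(phi):", base + sum(phi), "prediction:", toy_model(x))

# Global interpretation: mean absolute contribution per feature.
all_phi = [shapley_values(toy_model, row, background) for row in background]
global_importance = [
    sum(abs(p[i]) for p in all_phi) / len(all_phi) for i in range(len(x))
]
print("global importance:", global_importance)
```

The additivity check is the property that gives SHAP its name: each prediction decomposes into a base value plus one contribution per conditioning factor, which is what allows both sample-level and global readings of the same model.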
Data availability
Part of the code generated during the study is available in the repository below. Contact e-mail: liubo_lb68@163.com. Software required: Python 3.7. The partial source code can be downloaded from: https://github.com/liubo-lb68/LSM/blob/main/SHAP
Acknowledgements
We would like to express our gratitude to the Geological Environmental Center of Hubei Province for providing crucial spatial data, including the coordinates of 714 landslide samples, which greatly contributed to this research. Their cooperation and support were instrumental in the success of this study.
Furthermore, we extend our appreciation to the editorial team of the journal and the anonymous reviewers for their valuable feedback and suggestions during the review process. Their input significantly improved the quality and completeness of this paper.
Funding
This research was supported by the National Natural Science Foundation of China [Grant Numbers 72074198, 71874165]; the National Social Science Fund of China [Grant Number 23AZD072]; the Key Research and Development Program of Hubei Province [Grant Number 2021BCA219]; the Young Talents Foundation of the Central Propaganda Department [Grant Number 2020084007]; and the Fundamental Research Funds for the Central Universities, China University of Geosciences (Wuhan) [Grant Number CUG2642022006].
Author information
Contributions
BL was involved in conceptualization, methodology, software, investigation, and writing (original draft). HG was involved in validation, resources, and writing (review and editing). JL was involved in resources and writing (review and editing). XK was involved in writing (review and editing). XH was involved in software.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendices
Appendix 1: Pearson correlation coefficients between pairs of the 15 landslide conditioning factors
Factor | Elevation | Aspect | Slope | Plan curvature | Profile curvature | TRI | Surface roughness | TWI | SPI | Distance from river | Land use | Distance from road | Distance from fault | Lithology | NDVI |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Elevation | 1.00 | 0.01 | 0.20 | 0.00 | −0.10 | 0.22 | 0.15 | −0.12 | 0.03 | 0.48 | 0.36 | 0.43 | −0.05 | 0.18 | 0.48 |
Aspect | | 1.00 | 0.01 | 0.00 | 0.00 | 0.00 | 0.00 | 0.03 | 0.03 | −0.02 | 0.07 | −0.01 | 0.01 | 0.01 | −0.03 |
Slope | | | 1.00 | 0.02 | −0.01 | 0.75 | 0.89 | −0.32 | 0.35 | −0.05 | 0.35 | 0.10 | −0.07 | 0.01 | 0.27 |
Plan curvature | | | | 1.00 | −0.35 | 0.01 | 0.03 | −0.41 | −0.46 | −0.06 | 0.00 | −0.02 | −0.01 | −0.01 | −0.03 |
Profile curvature | | | | | 1.00 | −0.03 | −0.03 | 0.24 | 0.25 | −0.03 | −0.05 | −0.02 | −0.02 | 0.03 | −0.02 |
TRI | | | | | | 1.00 | 0.71 | −0.16 | 0.33 | −0.04 | 0.34 | 0.11 | −0.06 | −0.03 | 0.28 |
Surface roughness | | | | | | | 1.00 | −0.25 | 0.27 | −0.04 | 0.27 | 0.10 | −0.06 | 0.00 | 0.23 |
TWI | | | | | | | | 1.00 | 0.72 | −0.01 | −0.11 | −0.04 | 0.01 | −0.06 | −0.07 |
SPI | | | | | | | | | 1.00 | −0.03 | 0.12 | 0.01 | −0.06 | −0.03 | 0.10 |
Distance from river | | | | | | | | | | 1.00 | 0.08 | 0.29 | −0.02 | 0.12 | 0.33 |
Land use | | | | | | | | | | | 1.00 | 0.20 | −0.03 | 0.15 | 0.44 |
Distance from road | | | | | | | | | | | | 1.00 | 0.08 | 0.09 | 0.47 |
Distance from fault | | | | | | | | | | | | | 1.00 | −0.04 | −0.04 |
Lithology | | | | | | | | | | | | | | 1.00 | 0.29 |
NDVI | | | | | | | | | | | | | | | 1.00 |
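As an illustration of how the coefficients in Appendix 1 are defined and how such a matrix might be screened for collinear factors, here is a short standard-library sketch. The `pearson` function implements the usual sample formula. The `reported` dictionary copies a few coefficients from the table above. The 0.7 cut-off in `THRESHOLD` is an assumption chosen for illustration, not a threshold stated in the source.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in xs))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in ys))
    return cov / (spread_x * spread_y)


# Sanity check on made-up data: a perfectly linear pair has r = 1.
print(pearson([1, 2, 3], [2, 4, 6]))

# A few coefficients copied from Appendix 1 (not recomputed here).
reported = {
    ("Slope", "TRI"): 0.75,
    ("Slope", "Surface roughness"): 0.89,
    ("TRI", "Surface roughness"): 0.71,
    ("TWI", "SPI"): 0.72,
    ("Elevation", "NDVI"): 0.48,
    ("Elevation", "Distance from river"): 0.48,
    ("Plan curvature", "SPI"): -0.46,
}

THRESHOLD = 0.7  # assumed cut-off for "strongly correlated"; not from the source

# Flag factor pairs whose absolute correlation exceeds the assumed cut-off.
flagged = sorted(pair for pair, r in reported.items() if abs(r) > THRESHOLD)
for first, second in flagged:
    print(f"{first} vs {second}: r = {reported[(first, second)]:.2f}")
```

Under this assumed cut-off, the flagged pairs are the terrain-derived factors (slope, TRI, surface roughness) and the two hydrological indices (TWI, SPI). Strong correlation between factors is worth keeping in mind when reading per-factor SHAP contributions, since attribution may be shared among correlated inputs.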
Appendix 2: Landslide conditioning factors
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Liu, B., Guo, H., Li, J. et al. Application and interpretability of ensemble learning for landslide susceptibility mapping along the Three Gorges Reservoir area, China. Nat Hazards 120, 4601–4632 (2024). https://doi.org/10.1007/s11069-023-06374-3
DOI: https://doi.org/10.1007/s11069-023-06374-3