Abstract
Debris flows have long been a serious hazard in mountainous areas. Accurate debris flow susceptibility (DFS) assessment and interpretable prediction results play an important role in the prevention and control of debris flow disasters. Machine learning algorithms based on boosting ensemble techniques have been widely used in geohazard susceptibility studies because of their excellent predictive ability. However, Categorical Boosting (CatBoost) and Natural Gradient Boosting (NGBoost) have not yet been applied to DFS assessment, and few geohazard studies have systematically compared these boosting-based algorithms. Moreover, previous studies have mostly focused on comparing the predictive ability of algorithms, identifying the susceptibility zones of an entire study area, and ranking the importance of indicators, with little thorough analysis of the relationship between the indicators and debris flow susceptibility on different types of construction land. This study aimed to identify the optimal boosting-based DFS model and to explore the distribution characteristics and change rules of DFS in the study area, so as to provide decision support for debris flow disaster prevention and reduction. This is the first time that six boosting-based machine learning algorithms have been compared in a DFS assessment. After the optimal model was determined, the change rules of the indicators under different DFS levels were studied separately for the entire study area and for two types of construction land. An eXplainable Artificial Intelligence (XAI) method called SHapley Additive exPlanations (SHAP) was combined with the zonal statistics function of a geographic information system (GIS) to explore how each indicator affects the occurrence of debris flows. The results showed that CatBoost performed best and provided the most reasonable DFS result among the six boosting-based models. We found that debris flows were more likely to occur along rivers and on construction land at low altitude. Rural areas faced stronger pressure from rainfall and were characterized by a worse disaster-breeding environment than urban areas. This research enriches the application of machine learning in DFS assessment, explores the changing trends of indicators between different DFS levels, and provides suggestions for better debris flow disaster prevention and mitigation management.
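To make the attribution idea behind SHAP concrete, the following is a minimal sketch of exact Shapley values computed by enumerating all indicator coalitions. It is an illustration only and not the authors' workflow: the indicator names, the baseline and sample values, and the function `toy_score` are hypothetical, and the SHAP library uses faster tree-specific algorithms rather than this brute-force enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, features):
    """Exact Shapley values by enumerating every coalition of the other features."""
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):  # coalition sizes 0 .. n-1
            # Shapley weight for a coalition of size k that excludes f
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of f when it joins coalition s
                total += weight * (value_fn(s | {f}) - value_fn(s))
        phi[f] = total
    return phi


# Hypothetical, normalized indicator values (assumptions, not data from the study).
BASELINE = {"rainfall": 0.2, "elevation": 0.8, "river_distance": 0.7}
SAMPLE = {"rainfall": 0.9, "elevation": 0.1, "river_distance": 0.2}


def toy_score(x):
    """Made-up susceptibility score with one rainfall-elevation interaction."""
    return (
        0.5 * x["rainfall"]
        - 0.3 * x["elevation"]
        - 0.2 * x["river_distance"]
        + 0.4 * x["rainfall"] * (1 - x["elevation"])
    )


def coalition_value(coalition):
    """Score when indicators in the coalition take the sample's values
    and all remaining indicators stay at the baseline."""
    x = {name: (SAMPLE[name] if name in coalition else BASELINE[name])
         for name in BASELINE}
    return toy_score(x)


phi = shapley_values(coalition_value, list(SAMPLE))
for name, contribution in phi.items():
    print(f"{name}: {contribution:+.4f}")
```

The contributions sum to the difference between the sample's score and the baseline score, which is the additivity property that lets per-indicator attributions be aggregated, for example by averaging them over map zones.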
Acknowledgements
The authors would like to thank the handling editors and the anonymous reviewers for their valuable comments and suggestions, which significantly improved the quality of this paper. The authors also acknowledge data support from the Resource and Environment Science and Data Center, Chinese Academy of Sciences (http://www.resdc.cn; accessed on 27 June 2023), and the Geospatial Data Cloud site, Computer Network Information Center, Chinese Academy of Sciences (http://www.gscloud.cn; accessed on 27 June 2023).
Funding
This work was funded by the National Natural Science Foundation of China (Grant No. 42067065).
Author information
Contributions
All authors contributed to the study conception and design. Material preparation, data collection and analysis were performed by ZC, HQ and RJ. The first draft of the manuscript was written by ZC and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.
Ethics declarations
Competing interests
The authors have no relevant financial or non-financial interests to disclose.
About this article
Cite this article
Chen, Z., Quan, H., Jin, R. et al. Debris flow susceptibility assessment based on boosting ensemble learning techniques: a case study in the Tumen River basin, China. Stoch Environ Res Risk Assess 38, 2359–2382 (2024). https://doi.org/10.1007/s00477-024-02683-6
DOI: https://doi.org/10.1007/s00477-024-02683-6