Abstract
This study explores how the training sample selection method and the ratio of absence to presence samples affect landslide susceptibility assessment with the logistic stepwise regression (LSR) model. The area within an 8-km range of the main road in Wenchuan County is taken as the study area, and neighboring Mao County is used to test the robustness of the LSR models. Two ways of selecting training samples are compared: randomly selecting (RS) 70% of the sample set, and tenfold cross-validation (CV). The study also examines how model performance changes when the number of absence samples in the training set is 1, 5, 7, and 10 times the number of presence samples. The results show that (1) the CV-LSR model not only predicts more accurately than the RS-LSR model but is also more stable and effective in factor selection; and (2) when the training set contains too many absence samples, the model underestimates landslide susceptibility, and the model predicts best when the numbers of presence and absence samples are equal.
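The two sampling designs described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the sample counts, the function names, and the choice to fix the absence-to-presence ratio before splitting are all assumptions made for illustration, and the stepwise regression fit itself is left out.

```python
import random

def sample_with_ratio(presence, absence, ratio, rng):
    """Keep all presence samples and draw ratio * len(presence) absence samples."""
    n_absence = ratio * len(presence)
    if n_absence > len(absence):
        raise ValueError("not enough absence samples for this ratio")
    return presence + rng.sample(absence, n_absence)

def random_split(samples, train_fraction, rng):
    """RS: shuffle once and keep the first train_fraction as training data."""
    shuffled = samples[:]
    rng.shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]

def k_fold_splits(samples, k, rng):
    """CV: yield (train, validation) pairs; each sample is validated exactly once."""
    shuffled = samples[:]
    rng.shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        train = [s for j, fold in enumerate(folds) if j != i for s in fold]
        yield train, folds[i]

rng = random.Random(0)
# Hypothetical sample IDs: 100 landslide (presence) and 2000 non-landslide (absence) cells.
presence = [("P", i) for i in range(100)]
absence = [("A", i) for i in range(2000)]

# Absence-to-presence ratios named in the abstract.
for ratio in (1, 5, 7, 10):
    samples = sample_with_ratio(presence, absence, ratio, rng)
    rs_train, rs_test = random_split(samples, 0.7, rng)
    cv_splits = list(k_fold_splits(samples, 10, rng))
    print(ratio, len(samples), len(rs_train), len(rs_test), len(cv_splits))
```

In the RS design a model is fitted once on a single 70% subset, whereas in the CV design every sample serves for validation in exactly one of the ten folds, which is one plausible reason a CV-based model would be less sensitive to any single draw.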
Acknowledgements
We are grateful to the Chongqing Meteorological Administration for providing essential meteorological data and to the Chongqing Institute of Geology and Mineral Resources for offering valuable research data on historical landslides. Special thanks go to the editors and anonymous reviewers for their constructive and professional comments on earlier versions of the paper.
Funding
This research was funded by the National Key Research and Development Program of China (Grant Nos. 2018YFC1505501 and 2018YFC1505504).
About this article
Cite this article
Zhang, Y., Wen, H., Xie, P. et al. Hybrid-optimized logistic regression model of landslide susceptibility along mountain highway. Bull Eng Geol Environ 80, 7385–7401 (2021). https://doi.org/10.1007/s10064-021-02415-y
DOI: https://doi.org/10.1007/s10064-021-02415-y