Abstract
Epistemic uncertainty in data-driven landslide susceptibility assessment is often increased by the limited accuracy of individual models and by the uncertainty associated with selecting non-landslide samples. To address these issues, this paper takes landslides in Ji’an City, China, as a case study and proposes a heterogeneous ensemble learning method that incorporates the frequency ratio (FR) and semi-supervised sample expansion. Non-landslide samples were selected based on the superimposed frequency ratios of 12 environmental factors (FFR) and, together with historical landslide samples, were input into light gradient boosting machine (LightGBM), random forest (RF), and convolutional neural network (CNN) models for prediction. The predicted probabilities were integrated by four heterogeneous ensemble strategies, and the high-confidence results were used to expand the sample set. Model performance was evaluated using the area under the receiver operating characteristic curve (AUC), the partition frequency ratio (PFR), and other verification methods. The results demonstrate that negative samples selected by FFR sampling are more accurate than those obtained by random sampling, and that the FR-SSELR model, which combines frequency ratio sampling with the semi-supervised ensemble strategy, performs best (AUC = 0.971, accuracy (ACC) = 0.941). The landslide susceptibility map drawn from this model is more reasonable, with the lowest percentage of landslides in the low and very low susceptibility zones (sum of PFR = 0.194) and the highest percentage of landslides in the high and very high susceptibility zones (sum of PFR = 6.800). Furthermore, the FR-SSELR model improved economic benefits by 3.82–14.2%, offering valuable guidance for decision-making on landslide management and the sustainability of Ji’an City.
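The two building blocks named in the abstract, the frequency ratio and the expansion of samples from high-confidence ensemble predictions, can be illustrated with a minimal Python sketch. It uses the standard frequency ratio definition (the share of landslide cells in a factor class divided by the share of all cells in that class). Everything else is an assumption for illustration only: the function names, the toy data, the confidence thresholds (0.1 and 0.9), and the use of simple probability averaging, which is not necessarily one of the four ensemble strategies evaluated in the paper.

```python
from collections import Counter


def frequency_ratio(factor_classes, landslide_flags):
    """Standard FR per class: (share of landslide cells in the class)
    divided by (share of all cells in the class)."""
    total_cells = len(factor_classes)
    total_landslides = sum(landslide_flags)
    cells_per_class = Counter(factor_classes)
    slides_per_class = Counter(
        c for c, flag in zip(factor_classes, landslide_flags) if flag
    )
    return {
        c: (slides_per_class[c] / total_landslides) / (n / total_cells)
        for c, n in cells_per_class.items()
    }


def expand_samples(model_probs, low=0.1, high=0.9):
    """Average the per-model landslide probabilities (an assumed ensemble
    rule) and pseudo-label only the cells whose mean is confidently high
    or low. The thresholds are hypothetical, not values from the paper."""
    n_models = len(model_probs)
    pseudo_labels = {}
    for i in range(len(model_probs[0])):
        mean_p = sum(p[i] for p in model_probs) / n_models
        if mean_p >= high:
            pseudo_labels[i] = 1  # confident landslide sample
        elif mean_p <= low:
            pseudo_labels[i] = 0  # confident non-landslide sample
    return pseudo_labels


# Toy data: one environmental factor (slope class) over eight grid cells.
slope_class = ["gentle"] * 4 + ["steep"] * 4
landslide = [0, 0, 0, 1, 1, 1, 1, 0]
fr = frequency_ratio(slope_class, landslide)
print(fr)

# Toy probabilities from three base models for four unlabeled cells.
lgbm = [0.95, 0.50, 0.05, 0.92]
rf = [0.90, 0.60, 0.10, 0.70]
cnn = [0.97, 0.40, 0.02, 0.95]
expanded = expand_samples([lgbm, rf, cnn])
print(expanded)
```

In this toy case the steep class has an FR above 1 (landslides are over-represented there) and the gentle class an FR below 1, so non-landslide samples would preferentially be drawn from low-FR areas. Only the cells on which the three models agree with high confidence receive pseudo-labels; the ambiguous cells stay unlabeled.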
Data availability
All data were included in this manuscript.
Funding
This research was supported by the National Natural Science Foundation of China (grant nos. 41977221 and 41972267).
Author information
Contributions
Yangyang Zhao: conceptualization, methodology, software, data curation, writing—original draft and editing; Shengwu Qin: project administration, investigation, writing—review and editing, funding acquisition; Chaobiao Zhang: resources, investigation, writing—review and editing; Jingyu Yao: conceptualization, methodology, writing—review and editing; Ziyang Xing: software, writing—review and editing; Jiasheng Cao: visualization, supervision; Renchao Zhang: software, validation, supervision
Ethics declarations
Ethics approval
Not applicable.
Consent to participate
Not applicable.
Consent for publication
Not applicable.
Conflict of interest
The authors declare no competing interests.
Additional information
Responsible Editor: Philippe Garrigues
About this article
Cite this article
Zhao, Y., Qin, S., Zhang, C. et al. Landslide susceptibility assessment based on frequency ratio and semi-supervised heterogeneous ensemble learning model. Environ Sci Pollut Res 31, 32043–32059 (2024). https://doi.org/10.1007/s11356-024-33287-w
DOI: https://doi.org/10.1007/s11356-024-33287-w