Abstract
This study applies and compares six machine learning algorithms for mapping landslide susceptibility in Zhangjiajie City, Hunan Province, China: three base classifiers, namely random forest (RF), gradient boosting decision tree (GBDT), and extreme gradient boosting (XGB), and three hybrid classifiers that combine each of them with logistic regression (LR) (RF + LR, GBDT + LR, and XGB + LR). First, a landslide inventory was created from 206 historical landslide points and 412 non-landslide points and randomly divided into a training set (80%) and a testing set (20%). Second, a landslide factor database was established by selecting 15 landslide conditioning factors covering topography, hydrology, climate, geology, and human activities. A multicollinearity test was then applied to the factors, and the information gain ratio (IGR) technique was used to rank their importance. Subsequently, a series of metrics, including accuracy, precision, recall, F-measure, area under the receiver operating characteristic (ROC) curve (AUC), kappa index, mean absolute error (MAE), and root mean square error (RMSE), was used to evaluate the performance of the six models. Ranked by AUC, the GBDT + LR model performed best (AUC = 0.8168), followed by the XGB + LR (0.8124), XGB (0.8118), RF + LR (0.8060), GBDT (0.7927), and RF (0.7883) models. The results suggest that the stacking ensemble machine learning method is promising for landslide susceptibility mapping in the Zhangjiajie area and is capable of identifying landslide-prone areas.
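To make the evaluation metrics named in the abstract concrete, the following minimal Python sketch computes them for a binary landslide/non-landslide classifier. It is an illustration only: the labels and predicted probabilities are made-up values rather than the study's data, and the function name `evaluate`, the 0.5 decision threshold, and the choice to compute MAE and RMSE on predicted probabilities are assumptions not stated in the abstract.

```python
import math


def evaluate(y_true, y_prob, threshold=0.5):
    """Compute common binary-classification metrics (illustrative helper)."""
    # Turn probabilities into hard labels (threshold is an assumption).
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    n = len(y_true)

    accuracy = (tp + tn) / n
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)

    # Cohen's kappa: observed agreement corrected for chance agreement.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    kappa = (accuracy - expected) / (1 - expected) if expected != 1 else 0.0

    # MAE and RMSE between labels and predicted probabilities (assumption).
    mae = sum(abs(t - p) for t, p in zip(y_true, y_prob)) / n
    rmse = math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_prob)) / n)

    # AUC: fraction of (positive, negative) pairs ranked correctly,
    # counting ties as half. Requires at least one sample of each class.
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in pos for b in neg)
    auc = wins / (len(pos) * len(neg))

    return {"accuracy": accuracy, "precision": precision, "recall": recall,
            "f_measure": f_measure, "kappa": kappa, "mae": mae,
            "rmse": rmse, "auc": auc}


# Made-up example: 3 landslide points (1) and 5 non-landslide points (0).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_prob = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1, 0.05]
metrics = evaluate(y_true, y_prob)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Unlike accuracy, precision, recall, F-measure, and kappa, the AUC does not depend on a decision threshold, which may be one reason it is used to rank the six models.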
Data availability
The data that support the findings of this study are available on request from the corresponding author, Baoyi Zhang.
Acknowledgements
The authors thank the MapGIS Laboratory Co-Constructed by National Engineering Research Center for Geographic Information System of China and Central South University for providing MapGIS® software (Wuhan Zondy Cyber-Tech Co. Ltd., Wuhan, China). We also thank Mr. Dongliang Huang (Senior Engineer at the Hunan Provincial Planning Institute of Land and Resources) and Dr. Lifang Wang (Engineer at the Hunan Vocational College of Engineering) for providing and processing the dataset.
Funding
This study was supported by grants from the Hunan Provincial Natural Resource Science and Technology Planning Program of China (Grant No. 2021-53), the National Natural Science Foundation of China (Grant Nos. 42072326 and 41772348), and the National Key Research and Development Program of China (Grant No. 2019YFC1805905).
Contributions
Conceptualization, YH and BZ; methodology, YH; software, YH and UK; validation, BZ; investigation, LS; data curation, LS; writing—original draft preparation, YH and UK; writing—review and editing, BZ; funding acquisition, BZ. All authors have read and agreed to the published version of the manuscript.
Ethics declarations
Conflict of interest
The authors declare no conflicts of interest or competing interests.
About this article
Cite this article
Huan, Y., Song, L., Khan, U. et al. Stacking ensemble of machine learning methods for landslide susceptibility mapping in Zhangjiajie City, Hunan Province, China. Environ Earth Sci 82, 35 (2023). https://doi.org/10.1007/s12665-022-10723-z