Spatial prediction of landslide susceptibility using data mining-based kernel logistic regression, naive Bayes and RBFNetwork models for the Long County area (China)

  • Wei Chen (corresponding author)
  • Xusheng Yan
  • Zhou Zhao
  • Haoyuan Hong (corresponding author)
  • Dieu Tien Bui
  • Biswajeet Pradhan
Original Paper


The main goal of this study is to assess and compare three advanced machine learning techniques, namely kernel logistic regression (KLR), naïve Bayes (NB), and radial basis function network (RBFNetwork) models, for landslide susceptibility modeling in Long County, China. First, a total of 171 landslide locations were identified within the study area using historical reports, aerial photographs, and extensive field surveys. The landslides were randomly split in a 70/30 ratio into training and validation sets. Second, 12 landslide conditioning factors were prepared for landslide susceptibility modeling: slope aspect, slope angle, plan curvature, profile curvature, elevation, distance to faults, distance to rivers, distance to roads, lithology, NDVI (normalized difference vegetation index), land use, and rainfall. Third, the correlations between the conditioning factors and the occurrence of landslides were analyzed using normalized frequency ratios. Multicollinearity among the conditioning factors was assessed using tolerance and the variance inflation factor (VIF). Feature selection was performed using the chi-squared statistic with 10-fold cross-validation to assess the predictive capability of each conditioning factor, and factors with no predictive ability were excluded in order to optimize the landslide models. Finally, the trained KLR, NB, and RBFNetwork models were used to construct landslide susceptibility maps. The receiver operating characteristic (ROC) curve, the area under the curve (AUC), and several statistical measures, such as accuracy (ACC), F-measure, mean absolute error (MAE), and root mean squared error (RMSE), were used to assess, validate, and compare the resulting models and to choose the best model in this study.
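The frequency-ratio step described above can be illustrated with a minimal sketch. The frequency ratio of a factor class is taken here as the share of landslides falling in that class divided by the share of the study area the class occupies. The abstract does not state how the ratios were normalized, so the rescaling below (dividing by the sum over the classes of one factor) is an assumption, and the function name, class labels, and pixel counts are all hypothetical.

```python
def frequency_ratios(class_pixels, class_landslides):
    """Return (fr, normalized_fr) for the classes of one conditioning factor.

    class_pixels:     dict mapping class name -> number of pixels in the class
    class_landslides: dict mapping class name -> landslide pixels in the class
    """
    total_pixels = sum(class_pixels.values())
    total_landslides = sum(class_landslides.values())

    fr = {}
    for name, pixels in class_pixels.items():
        landslide_share = class_landslides.get(name, 0) / total_landslides
        area_share = pixels / total_pixels
        # FR > 1 means landslides are over-represented in this class.
        fr[name] = landslide_share / area_share

    # Assumed normalization: rescale so the ratios of one factor sum to 1.
    fr_sum = sum(fr.values())
    normalized_fr = {name: value / fr_sum for name, value in fr.items()}
    return fr, normalized_fr


# Hypothetical slope-angle classes; the counts are invented for illustration.
pixels = {"0-10": 5000, "10-20": 3000, "20-30": 2000}
landslides = {"0-10": 10, "10-20": 30, "20-30": 60}
fr, nfr = frequency_ratios(pixels, landslides)
for name in pixels:
    print(f"{name}: FR={fr[name]:.2f}, normalized FR={nfr[name]:.3f}")
```

In this invented example the steepest class holds 60% of the landslides on 20% of the area, so it receives the largest ratio.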
The validation results show that all three models perform reasonably well, with the KLR model being the most stable and the best performing. The KLR model, with a success rate of 0.847 and a prediction rate of 0.749, is a promising technique for landslide susceptibility mapping. Given the outcomes of the study, all three models could be used efficiently for landslide susceptibility analysis.
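The evaluation measures named in the abstract can be sketched as follows. This is not the authors' implementation: the 0.5 decision threshold, the computation of MAE and RMSE on predicted probability against the 0/1 label, and the sample data are assumptions made for illustration. In landslide susceptibility studies, the success rate and prediction rate conventionally refer to the AUC on the training and validation sets, respectively; that reading is assumed here as well.

```python
import math


def auc(labels, scores):
    """Area under the ROC curve via the rank (Mann-Whitney) formulation."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


def evaluate(labels, scores, threshold=0.5):
    """Return ACC, F-measure, MAE, RMSE and AUC for binary labels in {0, 1}."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)
    n = len(labels)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)

    # Assumption: errors are predicted probability minus the 0/1 label.
    errors = [s - y for y, s in zip(labels, scores)]
    return {
        "ACC": (tp + tn) / n,
        "F": f_measure,
        "MAE": sum(abs(e) for e in errors) / n,
        "RMSE": math.sqrt(sum(e * e for e in errors) / n),
        "AUC": auc(labels, scores),
    }


# Invented example: 1 = landslide, 0 = non-landslide; scores are probabilities.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1]
metrics = evaluate(labels, scores)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Calling `evaluate` once on the training samples and once on the validation samples would give the two AUC values to compare, under the assumed reading of success rate and prediction rate.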


Keywords: Landslide; Kernel logistic regression; Naive Bayes; RBF network; China



The authors would like to express their gratitude to the Editor-in-Chief Martin Gordon Culshaw and two anonymous reviewers for their helpful comments on the manuscript.


This research was supported by the China Postdoctoral Science Foundation (Grant No. 2017M613168), the Shaanxi Province Postdoctoral Science Foundation (Grant No. 2017BSHYDZZ07), the Scientific Research Program funded by the Shaanxi Provincial Education Department (Program No. 17JK0511), the Open Fund of the Shandong Provincial Key Laboratory of Depositional Mineralization & Sedimentary Minerals (Grant No. DMSM2017029), and the Open-ended Fund of the Key Laboratory for Geo-hazard in Loess Areas, Ministry of Land and Resources of China (Grant No. KLGLAMLR201603).



Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2018

Authors and Affiliations

  1. College of Geology & Environment, Xi’an University of Science and Technology, Xi’an, China
  2. Shandong Provincial Key Laboratory of Depositional Mineralization & Sedimentary Minerals, Shandong University of Science and Technology, Qingdao, China
  3. Key Laboratory of Virtual Geographic Environment, Nanjing Normal University, Ministry of Education, Nanjing, China
  4. State Key Laboratory Cultivation Base of Geographical Environment Evolution (Jiangsu Province), Nanjing, China
  5. Jiangsu Center for Collaborative Innovation in Geographic Information Resource Development and Application, Nanjing, China
  6. Department of Business Administration and Computer Sciences, Faculty of Arts and Sciences, Telemark University College, Bø i Telemark, Norway
  7. School of Systems, Management and Leadership, Faculty of Engineering and IT, University of Technology Sydney, Ultimo, Australia
  8. Department of Energy and Mineral Resources Engineering, Choongmu-gwan, Sejong University, Seoul, Republic of Korea
