Abstract
Landslide susceptibility was assessed in the Pauri Garhwal area of Uttarakhand state, India, where numerous landslides cause significant losses of life, infrastructure and property every year. Four decision tree-based machine learning methods, namely Random Forest (RF), Logistic Model Trees (LMT), Best First Decision Trees (BFDT) and Classification and Regression Trees (CART), were applied and their results compared for the spatial prediction of landslides. The analysis considered sixteen conditioning factors (slope angle, elevation, slope aspect, profile curvature, land cover, curvature, lithology, plan curvature, soil, distance to lineaments, lineament density, distance to roads, road density, distance to river, river density and rainfall) and 1295 historical landslide polygons. The models were validated and compared using the Receiver Operating Characteristic (ROC) curve and statistical indices. The results show that the RF model has the highest predictive capability, followed by the LMT, BFDT and CART models, respectively. Although all four methods performed well, the RF method performed best for landslide spatial prediction.
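As an illustration of ROC-based model comparison, the sketch below computes the area under the ROC curve (AUC) as the fraction of landslide/non-landslide pairs in which the landslide sample receives the higher susceptibility score, with ties counted as one half. This is not the authors' workflow: the function name `auc`, the labels and the two sets of scores are hypothetical and chosen only to show how models might be ranked by AUC.

```python
from itertools import product


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one landslide and one non-landslide sample")
    # Count how often a landslide sample outscores a non-landslide sample.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


# Hypothetical validation labels (1 = landslide, 0 = non-landslide) and
# made-up susceptibility scores from two models; not data from the study.
labels = [1, 1, 1, 0, 0, 0]
model_a = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]
model_b = [0.9, 0.3, 0.4, 0.5, 0.2, 0.1]

# Rank the models from highest to lowest AUC.
results = {"A": auc(labels, model_a), "B": auc(labels, model_b)}
ranking = sorted(results.items(), key=lambda kv: kv[1], reverse=True)
for name, value in ranking:
    print(f"model {name}: AUC = {value:.3f}")
```

An AUC of 1.0 corresponds to perfect separation of landslide and non-landslide samples, while 0.5 corresponds to random ranking. The pairwise form is quadratic in the number of samples, so it suits small validation sets; a rank-based formulation would be the usual choice for large ones.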
Acknowledgements
The authors are thankful to the Director, Bhaskarcharya Institute for Space Applications and Geo-Informatics (BISAG), Department of Science & Technology, Government of Gujarat, Gandhinagar, Gujarat, India, for providing facilities to carry out this research work.
Cite this article
Pham, B.T., Khosravi, K. & Prakash, I. Application and Comparison of Decision Tree-Based Machine Learning Methods in Landside Susceptibility Assessment at Pauri Garhwal Area, Uttarakhand, India. Environ. Process. 4, 711–730 (2017). https://doi.org/10.1007/s40710-017-0248-5
DOI: https://doi.org/10.1007/s40710-017-0248-5