Application and Comparison of Decision Tree-Based Machine Learning Methods in Landside Susceptibility Assessment at Pauri Garhwal Area, Uttarakhand, India

  • Original Article
Environmental Processes

Abstract

Landslide susceptibility assessment has been conducted at the Pauri Garhwal area of Uttarakhand state, India, an area affected by numerous landslides that cause significant losses of life, infrastructure and property every year. Decision tree-based machine learning methods, namely Random Forest (RF), Logistic Model Trees (LMT), Best First Decision Trees (BFDT) and Classification and Regression Trees (CART), have been used, and their results are compared herein for spatial prediction of landslides. The data were analysed considering sixteen conditioning factors (i.e., slope angle, elevation, slope aspect, profile curvature, land cover, curvature, lithology, plan curvature, soil, distance to lineaments, lineament density, distance to roads, road density, distance to river, river density and rainfall) and 1295 historical landslide polygons. Models were validated and compared using the Receiver Operating Characteristic (ROC) curve and statistical indices. The results show that the RF model has the highest predictive capability, followed by the LMT, BFDT and CART models, respectively. Although all four methods gave good results, the RF method performed best for landslide spatial prediction.
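The ROC-based comparison mentioned in the abstract can be illustrated with a minimal sketch. It computes the area under the ROC curve (AUC) with the rank formulation, under which the AUC is the probability that a randomly chosen landslide sample scores higher than a randomly chosen non-landslide sample, and then ranks models by AUC. The function name `roc_auc`, the model names and all labels and scores are hypothetical illustrations; they are not the paper's data, results or implementation.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the rank (Mann-Whitney) formulation.

    Equals the probability that a randomly chosen positive sample receives
    a higher score than a randomly chosen negative one; ties count as 0.5.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical validation labels: 1 = landslide cell, 0 = non-landslide cell.
labels = [1, 1, 1, 0, 0, 0]

# Hypothetical susceptibility scores from two models (not the paper's results).
scores_by_model = {
    "model_a": [0.9, 0.8, 0.7, 0.6, 0.3, 0.1],
    "model_b": [0.9, 0.4, 0.7, 0.6, 0.3, 0.5],
}

# Compute one AUC per model, then order the models from best to worst.
auc_by_model = {name: roc_auc(labels, s) for name, s in scores_by_model.items()}
ranking = sorted(auc_by_model, key=auc_by_model.get, reverse=True)
```

The pairwise loop is quadratic in the number of samples, which is acceptable for a small illustration; for large validation sets a sort-based rank computation or a library routine would be the more practical choice.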

Acknowledgements

The authors are thankful to the Director, Bhaskaracharya Institute for Space Applications and Geo-Informatics (BISAG), Department of Science & Technology, Government of Gujarat, Gandhinagar, Gujarat, India, for providing facilities to carry out this research work.

Author information

Corresponding author

Correspondence to Binh Thai Pham.

Cite this article

Pham, B.T., Khosravi, K. & Prakash, I. Application and Comparison of Decision Tree-Based Machine Learning Methods in Landside Susceptibility Assessment at Pauri Garhwal Area, Uttarakhand, India. Environ. Process. 4, 711–730 (2017). https://doi.org/10.1007/s40710-017-0248-5

  • DOI: https://doi.org/10.1007/s40710-017-0248-5