
Performance assessment of artificial neural network using chi-square and backward elimination feature selection methods for landslide susceptibility analysis

  • Original Article
Environmental Earth Sciences

Abstract

In machine learning models, it is desirable to remove redundant features from the data set to reduce data processing time and improve model accuracy. In this paper, chi-square (CS) and backward elimination (BE), two well-known feature selection methods, were used to select the optimum input features (factors) for training an artificial neural network (ANN) for landslide susceptibility modeling. Seventeen landslide-affecting factors were initially considered for the ANN model; these were reduced to twelve for the ANN optimized by CS (CSANN) and eleven for the ANN optimized by BE (BEANN). Accuracy (ACC), the Kappa index, root mean square error (RMSE), and the area under the receiver operating characteristic curve (AUROC) were used to evaluate and validate the performance of the models. The results show that both feature selection methods significantly improved the performance of the hybrid CSANN and BEANN models compared with the single ANN model. The BEANN model (AUROC 0.963; ACC 91.31) performed best in predicting landslide-susceptible areas, ahead of the CSANN (AUROC 0.950; ACC 89.80) and ANN (AUROC 0.949; ACC 76.40) models. It is therefore reasonable to state that BE is a more effective feature selection method than CS for improving the performance of the ANN model, and it can thus support better landslide susceptibility analysis for landslide management in the study area.
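The two feature selection ideas named in the abstract can be illustrated with a minimal Python sketch. It is a generic illustration, not the authors' implementation, which this preview does not describe. It assumes that each factor has been discretized into classes so that a Pearson chi-square statistic can be computed against the landslide/non-landslide label. The `evaluate` callback is a hypothetical stand-in for training the ANN on a factor subset and returning a validation score such as AUROC. The function names and the stopping rule (stop once every possible removal lowers the score) are likewise assumptions.

```python
from collections import Counter
from typing import Callable, List, Sequence


def chi_square_score(feature: Sequence, labels: Sequence) -> float:
    """Pearson chi-square statistic between a categorical factor and class labels.

    A larger value means the factor classes and the labels are less independent.
    """
    n = len(labels)
    feature_counts = Counter(feature)
    label_counts = Counter(labels)
    observed = Counter(zip(feature, labels))
    score = 0.0
    for f_value, f_count in feature_counts.items():
        for l_value, l_count in label_counts.items():
            # Expected cell count if factor and label were independent.
            expected = f_count * l_count / n
            score += (observed[(f_value, l_value)] - expected) ** 2 / expected
    return score


def backward_elimination(
    features: List[str],
    evaluate: Callable[[List[str]], float],
) -> List[str]:
    """Greedy backward elimination over a list of factor names.

    Starts from all factors and repeatedly drops the one whose removal gives
    the highest score. Stops when every removal would lower the score.
    `evaluate` is a hypothetical callback (higher is better).
    """
    selected = list(features)
    best = evaluate(selected)
    while len(selected) > 1:
        # Score every subset obtained by removing exactly one factor.
        candidates = [
            (evaluate([f for f in selected if f != drop]), drop)
            for drop in selected
        ]
        score, drop = max(candidates)
        if score < best:
            break
        best = score
        selected.remove(drop)
    return selected


if __name__ == "__main__":
    # Toy data: the first factor matches the labels, the second is independent.
    labels = [0, 0, 1, 1]
    print(chi_square_score([0, 0, 1, 1], labels))
    print(chi_square_score([0, 1, 0, 1], labels))

    # Toy scorer: rewards two "useful" factors, penalizes the others.
    useful = {"slope", "rainfall"}

    def toy_evaluate(subset: List[str]) -> float:
        return sum(1.0 if f in useful else -0.1 for f in subset)

    print(backward_elimination(["slope", "rainfall", "noise1", "noise2"], toy_evaluate))
```

In this sketch the chi-square score works as a filter: it ranks factors without training a model. Backward elimination works as a wrapper: it re-evaluates the model for every candidate removal, which costs more but accounts for interactions between factors.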



Author information

Corresponding author

Correspondence to Binh Thai Pham.

About this article

Cite this article

Pham, B.T., Van Dao, D., Acharya, T.D. et al. Performance assessment of artificial neural network using chi-square and backward elimination feature selection methods for landslide susceptibility analysis. Environ Earth Sci 80, 686 (2021). https://doi.org/10.1007/s12665-021-09998-5

  • DOI: https://doi.org/10.1007/s12665-021-09998-5
