Abstract
In machine learning models, it is desirable to remove redundant features from the data set to reduce data processing time and improve model accuracy. In this paper, chi-square (CS) and backward elimination (BE), two well-known feature selection methods, were used to select the optimum input features (factors) for training an artificial neural network (ANN) for landslide susceptibility modeling. Initially, seventeen landslide-affecting factors were considered for the ANN model; these were reduced to twelve by CS (giving the CSANN model) and to eleven by BE (giving the BEANN model). Accuracy (ACC), the Kappa index, root mean square error (RMSE), and the area under the receiver operating characteristic curve (AUROC) were used to evaluate and validate the performance of the models. The results show that both feature selection methods significantly improved the performance of the hybrid CSANN and BEANN models compared with the single ANN model. The BEANN model (AUROC 0.963; ACC 91.31) performed best, ahead of the CSANN (AUROC 0.950; ACC 89.80) and ANN (AUROC 0.949; ACC 76.40) models, in predicting landslide-susceptible areas. It is therefore reasonable to state that BE is a more effective feature selection method than CS for improving the performance of the ANN model, and that it can be used for better landslide susceptibility analysis in support of landslide management in the area.
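To make the two selection strategies concrete, the following is a minimal sketch in plain Python, not the implementation used in the study. It assumes categorical (class-binned) factors for the chi-square score, and it treats the model evaluation in backward elimination as a caller-supplied function. The factor names, the toy data, and the `toy_score` function are hypothetical and serve only as illustration; in practice the score would come from retraining the ANN on each candidate subset and measuring validation performance.

```python
from collections import Counter
from typing import Callable, List, Sequence, Tuple


def chi_square(feature: Sequence, labels: Sequence) -> float:
    """Pearson chi-square statistic between a categorical feature and class labels.

    A larger value indicates a stronger association with the class, so the
    feature ranks higher in a filter-style selection.
    """
    n = len(labels)
    observed = Counter(zip(feature, labels))
    feature_totals = Counter(feature)
    label_totals = Counter(labels)
    stat = 0.0
    for f_value, f_count in feature_totals.items():
        for c_value, c_count in label_totals.items():
            expected = f_count * c_count / n
            stat += (observed[(f_value, c_value)] - expected) ** 2 / expected
    return stat


def backward_elimination(
    features: Sequence[str],
    score: Callable[[List[str]], float],
) -> Tuple[List[str], float]:
    """Greedy wrapper-style selection.

    Start from all features and repeatedly drop the one whose removal gives
    the highest score. Stop when every possible removal lowers the score.
    Ties are resolved in favour of the smaller subset (an assumption of this
    sketch, not a rule taken from the paper).
    """
    selected = list(features)
    best = score(selected)
    while len(selected) > 1:
        candidates = [
            (score([f for f in selected if f != drop]), drop) for drop in selected
        ]
        candidate_score, drop = max(candidates, key=lambda pair: pair[0])
        if candidate_score < best:
            break
        best = candidate_score
        selected.remove(drop)
    return selected, best


# Hypothetical toy data: 1 = landslide, 0 = no landslide.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
slope_class = ["steep"] * 4 + ["gentle"] * 4       # perfectly tied to the label
noise_class = ["a", "b", "a", "b", "a", "b", "a", "b"]  # unrelated to the label

chi_slope = chi_square(slope_class, labels)   # 8.0
chi_noise = chi_square(noise_class, labels)   # 0.0

# Hypothetical stand-in for "validation score of the ANN on this subset".
contribution = {"slope": 5, "aspect": 3, "noise": -1}


def toy_score(subset: List[str]) -> float:
    return sum(contribution[name] for name in subset)


kept, kept_score = backward_elimination(["slope", "noise", "aspect"], toy_score)
print(chi_slope, chi_noise, kept, kept_score)
```

The chi-square score is a filter: it ranks each factor on its own and never consults the model. Backward elimination is a wrapper: it judges subsets by the model's own performance, which is more expensive but can account for interactions between factors.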
Cite this article
Pham, B.T., Van Dao, D., Acharya, T.D. et al. Performance assessment of artificial neural network using chi-square and backward elimination feature selection methods for landslide susceptibility analysis. Environ Earth Sci 80, 686 (2021). https://doi.org/10.1007/s12665-021-09998-5
DOI: https://doi.org/10.1007/s12665-021-09998-5