Abstract
Identifying areas prone to flooding is a key step in flood risk management. The purpose of this study is to develop and present a novel flood susceptibility model based on the Bayesian Additive Regression Tree (BART) methodology. The predictive performance of the new model is assessed by comparison with the Naïve Bayes (NB) and Random Forest (RF) based methods previously published in the literature. All models were tested on a real case study in the Kan watershed in Iran. The following fifteen climatic and geo-environmental variables were used as inputs to all flood susceptibility models: altitude, aspect, slope, plan curvature, profile curvature, drainage density, distance from river, distance from road, stream power index (SPI), topographic wetness index (TWI), topographic position index (TPI), curve number (CN), land use, lithology and rainfall. Based on the existing flood field survey and other information available for the analyzed area, a total of 118 locations were identified as prone to flooding. The available data were divided into two groups, with 70% used for training and 30% for validation of all models. The receiver operating characteristic (ROC) curve parameters were used to evaluate the predictive accuracy of the new and existing models. Based on the area under the curve (AUC), the new BART model (86%) outperformed the NB (80%) and RF (85%) models. Regarding the importance of input variables, the results showed that a location's altitude and its distance from the river are the most important variables for assessing flood susceptibility.
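The 70%/30% split and the AUC-based comparison described in the abstract can be illustrated with a minimal sketch. This is not the authors' code. The function names, the random seed, the use of a simple random shuffle for the split, and the toy scores are all assumptions made for illustration; the abstract does not state how the split was drawn or how non-flood points were sampled. The AUC is computed here through its rank interpretation: the probability that a randomly chosen flood location receives a higher susceptibility score than a randomly chosen non-flood location.

```python
import random


def train_validation_split(samples, train_fraction=0.7, seed=0):
    """Shuffle a copy of the samples and cut it at train_fraction.

    Assumption: a plain random split; the abstract only gives the 70/30 ratio.
    """
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Counts the share of (flood, non-flood) pairs in which the flood point
    has the higher score; tied scores count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical identifiers for the 118 flood locations mentioned in the abstract.
flood_ids = range(118)
train_ids, validation_ids = train_validation_split(flood_ids)
print(len(train_ids), len(validation_ids))  # 83 35

# Toy validation labels (1 = flood, 0 = non-flood) and made-up model scores.
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.1]
print(roc_auc(toy_labels, toy_scores))  # 0.75
```

Applied to each model's scores on the same validation set, a function like `roc_auc` would give the kind of side-by-side AUC values the abstract reports for BART, NB and RF. The pairwise form is quadratic in the number of points, which is acceptable for a data set of this size.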
Availability of Data and Materials
We do not have permission to release the data or code.
Acknowledgements
We acknowledge Tarbiat Modares University's support for this work.
Funding
The authors received no specific funding for this work.
Author information
Contributions
Saeid Janizadeh acquired the data; Saeid Janizadeh and Mehdi Vafakhah conceptualized and performed the analysis; Saeid Janizadeh wrote the manuscript and discussion and analyzed the data; Mehdi Vafakhah, Zoran Kapelan and Naghmeh Mobarghaee Dinan provided technical insights and edited, restructured, and improved the manuscript. All authors discussed the results and edited the manuscript. All authors have read and agreed to the published version of the manuscript.
Ethics declarations
Ethical Approval
We confirm that this manuscript has not been published elsewhere and is not under consideration by another journal.
Consent to Participate
All authors participated in preparing the manuscript and agree with its submission to Water Resources Management.
Consent to Publish
All authors have approved the publication of this manuscript in the Water Resources Management Journal.
Conflicts of Interest
The authors declare no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Janizadeh, S., Vafakhah, M., Kapelan, Z. et al. Novel Bayesian Additive Regression Tree Methodology for Flood Susceptibility Modeling. Water Resour Manage 35, 4621–4646 (2021). https://doi.org/10.1007/s11269-021-02972-7
DOI: https://doi.org/10.1007/s11269-021-02972-7