Abstract
Rapid urban development, expanding impervious surfaces, inadequate drainage systems and changes in extreme precipitation are the main factors behind the growing frequency of urban flooding. Urban flood maps, when used in urban development planning, can reduce flood damage and losses. However, regularly producing urban flood hazard maps with models that depend on detailed hydraulic and hydrological data is a major challenge, especially in developing countries. In this study, an urban flood hazard map was produced with limited data for Kermanshah city, Iran, using four machine learning models: Genetic Algorithm Rule-Set Production (GARP), Maximum Entropy (MaxEnt), Random Forest (RF) and Naïve Bayes (NB). The flood hazard predictors used in the modeling were slope, land use, precipitation, distance to river, distance to channel, curve number (CN) and elevation. A flood inventory map was produced from available reports and field surveys, in which 117 flooded points and 163 non-flooded points were identified. Model performance was evaluated using the area under the receiver operating characteristic curve (AUC-ROC), the Kappa statistic and hit-and-miss analysis. The RF model (AUC-ROC = 99.5%, Kappa = 98%, Accuracy = 90%, Success ratio = 99%, Threat score = 90% and Heidke skill score = 98%) performed better than the other models. Distance to channel, land use and CN contributed most to the flood modeling, whereas precipitation had the least effect. The findings show that machine learning methods can be a good alternative to distributed models for predicting urban flood-prone areas where detailed hydraulic and hydrological data are lacking.
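The hit-and-miss scores named in the abstract can be illustrated with a minimal sketch. It assumes the standard forecast-verification definitions based on a 2×2 contingency table of flooded and non-flooded validation points; the abstract does not give the formulas the study actually used, and the function name and the counts in the usage example are hypothetical.

```python
def hit_miss_scores(hits, false_alarms, misses, correct_negatives):
    """Compute common 2x2 contingency-table scores.

    hits              -- flooded points predicted as flooded
    false_alarms      -- non-flooded points predicted as flooded
    misses            -- flooded points predicted as non-flooded
    correct_negatives -- non-flooded points predicted as non-flooded

    Assumes every denominator below is non-zero.
    """
    a, b, c, d = hits, false_alarms, misses, correct_negatives
    n = a + b + c + d

    accuracy = (a + d) / n            # fraction of all points classified correctly
    success_ratio = a / (a + b)       # fraction of predicted floods that were real
    threat_score = a / (a + b + c)    # hits relative to all predicted or observed floods
    # Heidke skill score: improvement over chance agreement.
    heidke = 2 * (a * d - b * c) / ((a + c) * (c + d) + (a + b) * (b + d))

    return {
        "accuracy": accuracy,
        "success_ratio": success_ratio,
        "threat_score": threat_score,
        "heidke_skill_score": heidke,
    }


# Hypothetical counts, for illustration only (not taken from the study).
scores = hit_miss_scores(hits=30, false_alarms=2, misses=5, correct_negatives=47)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

For a two-class table, the Heidke skill score defined this way is algebraically the same quantity as Cohen's Kappa, so the two are expected to agree when computed from the same validation points.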
Acknowledgements
This work was supported by the Jundi-Shapur University of Technology (JSU) in the context of the Graduate Study Program (GSP). The authors thank the Urban Development Department and the Municipality of Kermanshah city, Iran, for providing the needed information.
About this article
Cite this article
Norallahi, M., Seyed Kaboli, H. Urban flood hazard mapping using machine learning models: GARP, RF, MaxEnt and NB. Nat Hazards 106, 119–137 (2021). https://doi.org/10.1007/s11069-020-04453-3
DOI: https://doi.org/10.1007/s11069-020-04453-3