Urban flood hazard mapping using machine learning models: GARP, RF, MaxEnt and NB

  • Original Paper
Natural Hazards

Abstract

Rapid urban development, increasing impermeable surfaces, poor drainage systems and changes in extreme precipitation are the most important factors behind the current increase in urban flooding, which has become an urban problem. Urban flood mapping, and its use in urban development planning, can reduce flood damage and losses. Regularly producing urban flood hazard maps with models that depend on detailed hydraulic and hydrological data is a major challenge, especially in developing countries. In this study, an urban flood hazard map was produced with limited data for Kermanshah city, Iran, using four machine learning models: Genetic Algorithm Rule-Set Production (GARP), Maximum Entropy (MaxEnt), Random Forest (RF) and Naïve Bayes (NB). The flood hazard predictors used in modeling were slope, land use, precipitation, distance to river, distance to channel, curve number (CN) and elevation. A flood inventory map was produced from available reports and field surveys, in which 117 flooded points and 163 non-flooded points were identified. Model performance was evaluated using the area under the receiver operating characteristic curve (AUC-ROC), the Kappa statistic and hit-and-miss analysis. The results show that the RF model (AUC-ROC = 99.5%, Kappa = 98%, Accuracy = 90%, Success ratio = 99%, Threat score = 90% and Heidke skill score = 98%) performed better than the other models. The results also show that distance to channel, land use and CN contributed most to the flood model, while precipitation had the least effect. The findings show that machine learning methods can be a good alternative to distributed models for predicting urban flood-prone areas where detailed hydraulic and hydrological data are lacking.
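
The evaluation measures named in the abstract can be illustrated with a short sketch. It is not the authors' code: the function names, the 0.5 classification threshold and the toy data are assumptions made for illustration, and the formulas are the standard textbook definitions for a 2x2 contingency table. Under these definitions, the Kappa statistic and the Heidke skill score are algebraically identical for a binary flooded/non-flooded table, which would be consistent with the two values reported for RF being equal.

```python
from dataclasses import dataclass


@dataclass
class Contingency:
    hits: int               # flooded points predicted as flooded
    false_alarms: int       # non-flooded points predicted as flooded
    misses: int             # flooded points predicted as non-flooded
    correct_negatives: int  # non-flooded points predicted as non-flooded


def contingency(observed, predicted):
    """Build the 2x2 table from parallel sequences of 0/1 labels."""
    a = b = c = d = 0
    for obs, pred in zip(observed, predicted):
        if pred and obs:
            a += 1
        elif pred:
            b += 1
        elif obs:
            c += 1
        else:
            d += 1
    return Contingency(a, b, c, d)


def scores(t):
    """Hit-and-miss scores; assumes at least one predicted and one observed flood."""
    a, b, c, d = t.hits, t.false_alarms, t.misses, t.correct_negatives
    n = a + b + c + d
    accuracy = (a + d) / n
    # Agreement expected by chance, from the row and column totals.
    expected = ((a + b) * (a + c) + (c + d) * (b + d)) / n**2
    return {
        "accuracy": accuracy,
        "success_ratio": a / (a + b),
        "threat_score": a / (a + b + c),
        "heidke_skill_score": 2 * (a * d - b * c)
        / ((a + c) * (c + d) + (a + b) * (b + d)),
        "kappa": (accuracy - expected) / (1 - expected),
    }


def auc_roc(observed, probabilities):
    """AUC as the chance that a random flooded point is scored above a
    random non-flooded point; ties count as one half."""
    pos = [p for o, p in zip(observed, probabilities) if o]
    neg = [p for o, p in zip(observed, probabilities) if not o]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data (hypothetical): 4 flooded and 6 non-flooded validation points.
    observed = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    probabilities = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1, 0.1]
    predicted = [int(p >= 0.5) for p in probabilities]  # assumed threshold
    print(scores(contingency(observed, predicted)))
    print("AUC-ROC:", auc_roc(observed, probabilities))
```

AUC-ROC is computed from the continuous model output, whereas the other scores depend on the threshold used to convert that output into flooded and non-flooded classes, so the two kinds of measure can rank models differently.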

Acknowledgements

This work was supported by the Jundi-Shapur University of Technology (JSU) in the context of the Graduate Study Program (GSP). The authors would like to thank the Urban Development Department and the Municipality of Kermanshah city, Iran, for providing the required information.

Author information

Corresponding author

Correspondence to Hesam Seyed Kaboli.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Norallahi, M., Seyed Kaboli, H. Urban flood hazard mapping using machine learning models: GARP, RF, MaxEnt and NB. Nat Hazards 106, 119–137 (2021). https://doi.org/10.1007/s11069-020-04453-3

  • DOI: https://doi.org/10.1007/s11069-020-04453-3
