Comparative assessment of bivariate, multivariate and machine learning models for mapping flood proneness

Al-Abadi, Alaa M.; Al-Najar, Noor A.

doi:10.1007/s11069-019-03821-y

Original Paper
Published: 26 November 2019

Volume 100, pages 461–491, (2020)
Natural Hazards

Abstract

This study applied two bivariate statistical models (frequency ratio and information value), one multivariate statistical model (logistic regression), and two supervised statistical learning models (boosted regression trees and classification and regression trees) to map flood proneness in an arid region of southern Iraq. Ten flood causative factors were chosen based on data availability and local conditions, along with the spatial extent of the large flood that affected the study area on 13 May 2013. The factors comprised topography-related factors (elevation, slope, curvature, topographic wetness index, and stream power index), lithology, soil, land use/land cover, average annual rainfall, and distance to rivers. A multicollinearity test indicated no multicollinearity problem among the factors used. Ranking the factors by information gain ratio showed that the most important factor controlling flood proneness was elevation, followed by average annual rainfall, distance to rivers, land use/land cover, lithology, and soil. The models were then run with these most important factors to produce flood proneness maps. The flood proneness values were categorized into five classes using a quantile classification scheme. The models were validated using the area under the receiver operating characteristic curve (AUC). The AUC for the prediction data set was 0.793, 0.786, 0.779, 0.754, and 0.753 for classification and regression trees, boosted regression trees, logistic regression, information value, and frequency ratio, respectively. For the best-performing model (classification and regression trees), the areas occupied by the very low, low, moderate, high, and very high flood proneness zones were 2735 km², 2809 km², 2816 km², 2732 km², and 2801 km², respectively.
The main conclusion is that the machine learning models performed best in mapping flood proneness in the study area, followed by the multivariate and then the bivariate models. Decision makers and hydrologists can adopt the flood proneness maps developed in this study to improve the management of excess floodwater and to prevent flood-related damage.
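
As an illustration of the two bivariate models and the AUC validation named in the abstract, the following minimal Python sketch may help. It is not the authors' implementation: it assumes the common formulation in which the frequency ratio of a factor class is the share of flooded cells in that class divided by the share of all cells in that class, and the information value is the natural logarithm of that ratio. The toy elevation classes, the flood labels, and all function names are hypothetical.

```python
import math
from collections import Counter


def frequency_ratio(classes, flooded):
    """FR per class: share of flooded cells in the class / share of all cells in the class."""
    total = len(classes)
    total_flooded = sum(flooded)
    cells = Counter(classes)
    flood_cells = Counter(c for c, f in zip(classes, flooded) if f)
    return {
        c: (flood_cells[c] / total_flooded) / (cells[c] / total)
        for c in cells
    }


def information_value(fr):
    """IV per class, assumed here to be ln(FR); undefined (None) when no cell flooded."""
    return {c: (math.log(v) if v > 0 else None) for c, v in fr.items()}


def auc(scores, labels):
    """AUC as the probability that a flooded cell outranks a non-flooded one (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: one elevation class per grid cell and a flooded flag (1 = flooded).
classes = ["low", "low", "low", "low", "mid", "mid", "mid", "high", "high", "high"]
flooded = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

fr = frequency_ratio(classes, flooded)
iv = information_value(fr)

# Use each cell's class FR as its flood proneness score and check how well it ranks the cells.
scores = [fr[c] for c in classes]
print("frequency ratio:", fr)
print("information value:", iv)
print("AUC:", auc(scores, flooded))
```

In a study of this kind the per-class weights of all selected factors would typically be summed per cell to obtain the proneness index, and the AUC would be computed on a held-out prediction data set rather than on the cells used to derive the weights.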

Author information

Authors and Affiliations

Department of Geology, College of Science, University of Basrah, Basrah, Iraq
Alaa M. Al-Abadi & Noor A. Al-Najar

Corresponding author

Correspondence to Alaa M. Al-Abadi.

About this article

Cite this article

Al-Abadi, A.M., Al-Najar, N.A. Comparative assessment of bivariate, multivariate and machine learning models for mapping flood proneness. Nat Hazards 100, 461–491 (2020). https://doi.org/10.1007/s11069-019-03821-y

Received: 06 February 2019
Accepted: 19 November 2019
Published: 26 November 2019
Issue Date: January 2020
DOI: https://doi.org/10.1007/s11069-019-03821-y
