Comparative assessment of bivariate, multivariate and machine learning models for mapping flood proneness

  • Original Paper
Natural Hazards

Abstract

This study applied two bivariate statistical models (frequency ratio and information value), one multivariate statistical model (logistic regression), and two supervised machine learning models (boosted regression trees and classification and regression trees) to map flood proneness in an arid region of southern Iraq. For this purpose, ten flood causative factors were chosen based on data availability and local conditions, along with the spatial extent of the large flood that affected the study area on 13 May 2013. The factors comprised five topography-related factors (elevation, slope, curvature, topographic wetness index, and stream power index), lithology, soil, land use/land cover, average annual rainfall, and distance to rivers. A multicollinearity test indicated no multicollinearity problem among the factors. Ranking the factors by information gain ratio showed that the most important factor controlling flood proneness was elevation, followed by average annual rainfall, distance to rivers, land use/land cover, lithology, and soil. The models were run with these most important factors to produce flood proneness maps, and the flood proneness values were categorized into five classes using a quantile classification scheme. The models were validated using the area under the receiver operating characteristic curve (AUC). The AUC for the prediction data set was 0.793, 0.786, 0.779, 0.754, and 0.753 for classification and regression trees, boosted regression trees, logistic regression, information value, and frequency ratio, respectively. For the best-performing model (classification and regression trees), the areas occupied by the very low, low, moderate, high, and very high flood proneness zones were 2735 km², 2809 km², 2816 km², 2732 km², and 2801 km², respectively. The main conclusion is that the machine learning models performed best in mapping flood proneness in the study area, followed by the multivariate and bivariate models. Decision makers and hydrologists can adopt the flood proneness maps developed in this study to improve the management of excess floodwater and to prevent flood-related damage.
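To make three of the steps named in the abstract concrete (frequency ratio, quantile classification, and AUC validation), the following is a minimal sketch in plain Python. It is not the authors' implementation: the function names, the three elevation classes, and the ten-cell toy data are all hypothetical and chosen only for illustration. The frequency ratio is taken in its common form, the share of flooded cells in a class divided by the share of all cells in that class, which is assumed here rather than confirmed by the abstract.

```python
from collections import Counter


def frequency_ratio(classes, flooded):
    """Frequency ratio per factor class (assumed common definition).

    FR(c) = (share of flooded cells that fall in class c)
            / (share of all cells that fall in class c)
    """
    total = len(classes)
    total_flooded = sum(flooded)
    cells_per_class = Counter(classes)
    flooded_per_class = Counter(c for c, f in zip(classes, flooded) if f)
    return {
        c: (flooded_per_class[c] / total_flooded) / (n / total)
        for c, n in cells_per_class.items()
    }


def auc(scores, labels):
    """AUC as the probability that a flooded cell scores higher than a
    non-flooded cell, counting ties as one half."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def quantile_classes(scores, k=5):
    """Assign each score a class 0..k-1 so that classes hold near-equal
    numbers of cells. Tied scores are split by their input order."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    out = [0] * len(scores)
    for rank, i in enumerate(order):
        out[i] = rank * k // len(scores)
    return out


# Hypothetical toy raster: ten cells, one factor (elevation class),
# and a flood inventory flag per cell. Invented data, not from the study.
elevation_class = ["low"] * 4 + ["mid"] * 3 + ["high"] * 3
flooded = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

fr = frequency_ratio(elevation_class, flooded)
scores = [fr[c] for c in elevation_class]   # proneness score per cell
zones = quantile_classes(scores, k=5)       # 0 = very low ... 4 = very high
score_auc = auc(scores, flooded)

print(fr)
print(zones)
print(round(score_auc, 3))
```

Because a quantile scheme puts roughly the same number of cells in each class, it yields zones of similar area, which is consistent with the nearly equal zone areas reported above. In practice a library routine such as scikit-learn's `roc_auc_score` would normally replace the hand-written `auc` function.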

Corresponding author

Correspondence to Alaa M. Al-Abadi.

About this article

Cite this article

Al-Abadi, A.M., Al-Najar, N.A. Comparative assessment of bivariate, multivariate and machine learning models for mapping flood proneness. Nat Hazards 100, 461–491 (2020). https://doi.org/10.1007/s11069-019-03821-y

  • DOI: https://doi.org/10.1007/s11069-019-03821-y
