Mathematical Geosciences, Volume 49, Issue 6, pp 717–735

Machine Learning Based Predictive Modeling of Debris Flow Probability Following Wildfire in the Intermountain Western United States

  • Ashley N. Kern
  • Priscilla Addison
  • Thomas Oommen
  • Sean E. Salazar
  • Richard A. Coffman

Abstract

Wildfire followed by large precipitation events is known to trigger both flooding and debris flows in mountainous regions. The ability to predict and mitigate these hazards is crucial in protecting public safety and infrastructure. A re-evaluation of existing prediction models from the literature highlighted the need for more advanced modeling techniques. Data from 15 individual burn basins in the intermountain western United States, comprising 388 instances and 26 variables, were obtained from the United States Geological Survey (USGS). A randomly selected subset of the data was set aside as a validation set, and machine learning based predictive models were built on the remaining training data. Tenfold cross-validation was applied to the training data to obtain nearly unbiased error estimates and to avoid model over-fitting. Linear, nonlinear, and rule-based predictive models, including naïve Bayes, mixture discriminant analysis, classification trees, and logistic regression models, were developed and tested on the validation dataset. The new nonlinear approaches were nearly twice as successful as the linear models previously published in the debris flow prediction literature. The new prediction models advance the state of the art in debris flow prediction and improve the ability to accurately predict debris flow events in the wildfire-prone intermountain western United States.
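The data-splitting scheme described in the abstract (a random hold-out validation set, then tenfold cross-validation on the remaining training data) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic random numbers shaped like the 388 instances and 26 variables mentioned above, the 20% hold-out fraction is an assumed value that the abstract does not give, and the majority-class "model" is only a placeholder standing in for naïve Bayes, logistic regression, and the other classifiers. All function names are hypothetical.

```python
import random
from collections import Counter


def holdout_split(n, holdout_fraction, rng):
    """Shuffle indices 0..n-1 and set aside a fraction as a validation set."""
    indices = list(range(n))
    rng.shuffle(indices)
    n_holdout = round(n * holdout_fraction)
    return indices[n_holdout:], indices[:n_holdout]  # (training, validation)


def k_folds(indices, k):
    """Partition indices into k folds whose sizes differ by at most one."""
    return [indices[i::k] for i in range(k)]


def cross_validate(X, y, train_idx, k, fit, predict):
    """Mean accuracy over k folds; each fold is scored by a model
    fitted on the other k-1 folds only."""
    folds = k_folds(train_idx, k)
    accuracies = []
    for i, test_fold in enumerate(folds):
        fit_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        model = fit([X[j] for j in fit_idx], [y[j] for j in fit_idx])
        hits = sum(predict(model, X[j]) == y[j] for j in test_fold)
        accuracies.append(hits / len(test_fold))
    return sum(accuracies) / k


# Placeholder classifier (assumption): always predicts the majority class.
def fit_majority(X_rows, y_rows):
    return Counter(y_rows).most_common(1)[0][0]


def predict_majority(model, x_row):
    return model


rng = random.Random(0)
n_instances, n_variables = 388, 26  # sizes taken from the abstract
# Synthetic stand-in data; the real USGS records are not reproduced here.
X = [[rng.random() for _ in range(n_variables)] for _ in range(n_instances)]
y = [rng.randint(0, 1) for _ in range(n_instances)]  # 1 = debris flow, 0 = none

# 0.2 is an assumed hold-out fraction, not a value reported in the abstract.
train_idx, valid_idx = holdout_split(n_instances, 0.2, rng)
cv_accuracy = cross_validate(X, y, train_idx, 10, fit_majority, predict_majority)
print(f"training: {len(train_idx)}, validation: {len(valid_idx)}")
print(f"tenfold CV accuracy of the placeholder model: {cv_accuracy:.3f}")
```

The validation indices are never passed to `cross_validate`, so model selection on the training data cannot leak into the final hold-out test. Swapping in a real classifier only requires replacing the `fit` and `predict` callables.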

Keywords

  • Burn severity
  • Naïve Bayes
  • Mixture discriminant analysis
  • Remote sensing

Notes

Acknowledgements

This project was funded by the US Department of Transportation (USDOT) through the Office of the Assistant Secretary for Research and Technology. The authors would also like to thank the following individuals for their contributions to the work described: Caesar Singh, USDOT program manager, and Susan Cannon for providing data and necessary guidance.

Copyright information

© International Association for Mathematical Geosciences 2017

Authors and Affiliations

  • Ashley N. Kern (1)
  • Priscilla Addison (1)
  • Thomas Oommen (1)
  • Sean E. Salazar (2)
  • Richard A. Coffman (2)

  1. Department of Geological and Mining Engineering and Science, Michigan Technological University, Houghton, USA
  2. Department of Civil Engineering, University of Arkansas, Fayetteville, USA
