Abstract
Wildfire ignition models can help identify risk factors and map high-risk areas, an urgent need as wildfires become increasingly destructive. Despite advances in data collection and analysis, challenges persist because ignitions depend on numerous interconnected factors at a fine spatial resolution. Predicting wildfire ignitions from data is an imbalanced classification problem, because non-ignition samples vastly outnumber ignition samples. To address this issue, this study proposes an ensemble-based model for binary classification of wildfire ignitions. The data are collected for a 24,867 km² area in northern California from January 2014 to May 2022 and include 76 predictors covering topographic, land cover, anthropogenic, and climatic data. Different base classifiers are evaluated, and the random forest is found to be the most performant, yielding a recall of 0.67 and a specificity of 0.87. Feature importance analysis shows that the Topographic Wetness Index is the most important climatic predictor, while population density and land cover development are also highly rated. Comparing the yearly average of computed daily probabilities with ignition data shows that the model accurately captures the spatial pattern of ignitions, which can reveal high-risk areas. The model is then used to assess how climatic and anthropogenic factors affect wildfire ignition frequency. The projected scenarios show that the number and spread of ignitions would increase significantly with population growth in sparsely populated areas, while climatic factors have secondary effects in isolation but may compound the risk in combination. As current land development and climate change trends are expected to increase the frequency and severity of wildfires, data-based models can provide insights to inform policy and mitigate risk.
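The two reported metrics, recall and specificity, can be illustrated with a minimal Python sketch. It is not the authors' code: the function names and the toy labels are hypothetical, and the counts are made up for illustration only. It shows why accuracy alone is misleading on imbalanced data, since a classifier that never predicts an ignition still scores high accuracy while its recall is zero.

```python
def recall_specificity(y_true, y_pred):
    """Return (recall, specificity) for binary labels, where 1 = ignition."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # ignitions caught
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # ignitions missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # correct non-ignitions
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    recall = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return recall, specificity


def accuracy(y_true, y_pred):
    """Fraction of samples labelled correctly."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Hypothetical toy data: 3 ignitions among 100 samples (illustrative only).
y_true = [1] * 3 + [0] * 97

# A trivial classifier that always predicts "no ignition".
y_pred_trivial = [0] * 100

# A hypothetical classifier that catches 2 of 3 ignitions with 13 false alarms.
y_pred_model = [1, 1, 0] + [1] * 13 + [0] * 84

if __name__ == "__main__":
    for name, pred in [("trivial", y_pred_trivial), ("model", y_pred_model)]:
        r, s = recall_specificity(y_true, pred)
        print(f"{name}: accuracy={accuracy(y_true, pred):.2f} "
              f"recall={r:.2f} specificity={s:.2f}")
```

In this toy case the trivial classifier has the higher accuracy yet detects no ignitions, which is why recall and specificity are the more informative pair for this kind of problem.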
Funding
Financial support for this project has been provided by Johns Hopkins University via Professor Gernay’s faculty startup fund.
Author information
Contributions
All authors contributed to the study conception and design. Data collection, coding, and analysis were performed by QT. The first draft of the manuscript was written by QT. TG supervised the work and commented on previous versions of the manuscript. All authors read and approved the final manuscript.
Ethics declarations
Competing interests
The authors have no relevant financial or non-financial interests to disclose.
About this article
Cite this article
Tong, Q., Gernay, T. Mapping wildfire ignition probability and predictor sensitivity with ensemble-based machine learning. Nat Hazards 119, 1551–1582 (2023). https://doi.org/10.1007/s11069-023-06172-x
DOI: https://doi.org/10.1007/s11069-023-06172-x