Prediction of Forest Fire Risk for Artillery Military Training using Weighted Support Vector Machine for Imbalanced Data

Nam, Ji Hyun; Mun, Jongmin; Jo, Seongil; Kim, Jaeoh

doi:10.1007/s00357-024-09467-1

Published: 04 March 2024

Journal of Classification, Volume 41, pages 170–189 (2024)

Ji Hyun Nam¹,
Jongmin Mun²,
Seongil Jo¹ &
…
Jaeoh Kim ORCID: orcid.org/0000-0001-7831-6353¹

Abstract

Since the 1953 truce, the Republic of Korea Army (ROKA) has regularly conducted artillery training, which poses a risk of wildfires, a threat to both the environment and the public perception of national defense. To assess this risk and aid decision-making within the ROKA, we built a predictive model of wildfires triggered by artillery training. To this end, we combined the ROKA dataset with a meteorological database. Given the infrequent occurrence of wildfires (imbalance ratio \(\approx\) 1:24 in our dataset), achieving balanced detection of wildfire occurrences and non-occurrences is challenging. Our approach combines a weighted support vector machine with Gaussian mixture-based oversampling, effectively penalizing misclassification of wildfires. Applied to our dataset, our method outperforms traditional algorithms (G-mean = 0.864, sensitivity = 0.956, specificity = 0.781), indicating balanced detection. This study not only helps reduce wildfires during artillery training but also provides a practical wildfire prediction method for similar climates worldwide.
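The reported G-mean can be checked against the reported sensitivity and specificity. The sketch below assumes the usual definition of G-mean as the geometric mean of the two rates; the abstract does not spell the formula out. The function names and the confusion-matrix counts are hypothetical, chosen only to mirror a 1:24 imbalance, and are not taken from the ROKA dataset.

```python
import math


def g_mean(sensitivity: float, specificity: float) -> float:
    """Geometric mean of sensitivity and specificity (assumed definition)."""
    return math.sqrt(sensitivity * specificity)


def rates_from_counts(tp: int, fn: int, tn: int, fp: int) -> tuple[float, float]:
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # share of actual wildfires that were detected
    specificity = tn / (tn + fp)  # share of non-wildfires correctly rejected
    return sensitivity, specificity


# Values reported in the abstract.
reported = g_mean(0.956, 0.781)
print(f"G-mean from reported rates: {reported:.3f}")

# Hypothetical counts at a 1:24 ratio (4 wildfires, 96 non-wildfires) for a
# classifier that always predicts "no wildfire". These are illustrative only.
tp, fn, tn, fp = 0, 4, 96, 0
sens, spec = rates_from_counts(tp, fn, tn, fp)
accuracy = (tp + tn) / (tp + fn + tn + fp)
trivial = g_mean(sens, spec)
print(f"accuracy = {accuracy:.2f}, G-mean = {trivial:.2f}")
```

In the hypothetical case, accuracy looks high while G-mean collapses to zero because no wildfire is detected, which is why a balanced measure suits data this imbalanced.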

Data Availability

Part of the data used in this study is available at https://github.com/jihyun-nam/Prediction-of-Forest-Fire-Risk. Access to the complete dataset requires permission from the Republic of Korea Army, so please contact the authors to arrange the necessary permissions for the entire dataset.

Code Availability

The code used in this paper is available at https://github.com/jihyun-nam/Prediction-of-Forest-Fire-Risk.

Funding

Ji Hyun Nam was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government Ministry of Science and ICT (MSIT) (RS-2023-00278691). Seongil Jo was supported by Basic Science Research Program through the NRF funded by the Korea government (MSIT) (RS-2023-00209229). Jaeoh Kim was supported by the NRF Grant through the Korea Government (MSIT) under Grant NRF-2022R1A5A7033499.

Author information

Authors and Affiliations

Department of Statistics and Data Science, Inha University, Incheon, South Korea
Ji Hyun Nam, Seongil Jo & Jaeoh Kim
Marshall School of Business, University of Southern California, Los Angeles, USA
Jongmin Mun

Contributions

J.H. Nam and J. Mun contributed equally to this work, while both S. Jo and J. Kim supervised the findings as corresponding authors.

Corresponding author

Correspondence to Jaeoh Kim.

Ethics declarations

Conflicts of interest

The authors declare that they have no conflicts of interest.

Competing Interests

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Ji Hyun Nam and Jongmin Mun are the first authors of this paper. Jaeoh Kim is the corresponding author and Seongil Jo is the co-corresponding author.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Nam, J.H., Mun, J., Jo, S. et al. Prediction of Forest Fire Risk for Artillery Military Training using Weighted Support Vector Machine for Imbalanced Data. J Classif 41, 170–189 (2024). https://doi.org/10.1007/s00357-024-09467-1

Accepted: 13 February 2024
Published: 04 March 2024
Issue Date: March 2024
DOI: https://doi.org/10.1007/s00357-024-09467-1
