Applying Machine Learning for Firebrand Production Prediction

This article presents a machine learning (ML) based metamodeling framework for firebrand production prediction. This framework was implemented to predict the firebrand areal mass density (FAMD) and firebrand areal number density (FAND) of landing firebrands using a large set of data from full-scale laboratory firebrand production experiments. The independent variables used in our ML models to predict the dependent variables FAND and FAMD were landing (or travel) distance, wind speed, and fuel type (structural and vegetative fuels). It was demonstrated that the non-linear non-parametric ML model, K-nearest neighbors (KNN), works the best for this purpose. The KNN model predicted discrete FAND and FAMD values with an accuracy higher than 90%. The current ML model can be used to predict locations with high risk of spotting ignition potential. This research is a small step towards the bigger goal of creating a numerical firebrand production simulator.


Introduction
Fire modeling is the numerical simulation of fires with the purpose of reproducing or predicting fire behavior. The calculations in fire modeling are usually performed on computers and account for fuel characteristics, fire dynamics, and environmental conditions. Fire modeling includes the simulation of fires in an enclosure (such as fires in a room or building) and fires in the open [such as wildfires or wildland-urban interface (WUI) fires]. As wildfires become more frequent and more destructive in fire-prone areas, the ability to predict wildfire development and structure ignition can greatly aid efforts in wildfire prevention, suppression, and damage mitigation. Proper use of wildfire modeling can also help protect regional ecosystems and air quality. Available fire models can be classified into three categories: empirical, semi-empirical, and physics-based [1][2][3]. Proper fire modeling requires the right balance among model fidelity, data availability, and execution speed.
*Correspondence should be addressed to: Aixi Zhou, E-mail: azhou@ncat.edu. Fire Technology, 58, 3261-3290, 2022. © 2022 The Author(s). Manufactured in The United States. https://doi.org/10.1007/s10694-022-01309-z
Available fire models do not adequately model the role of firebrands in ignition and fire spread. For example, for wildfires and WUI fires, it is desirable that the fire model considers all three mechanisms of ignition and spread: direct flame contact, radiant heat, and firebrands (or embers). Firebrands are the number-one cause of structure ignition and destruction in the WUI [4][5][6][7][8][9]. Firebrands are also a major source of wildfire spread, as they can pass fire barriers or firebreaks and start a new fire in a remote area, thus increasing the probability of unpredictable fire growth.
However, using available models (such as physics-based fire models) to account for the firebrand spotting process (including firebrand production, lofting, transportation, landing, and ignition of recipient fuels) can impose a large computational burden. Modeling firebrand spotting needs to consider the physical properties of firebrands, the environmental conditions, and fuel properties. Metamodeling is an excellent choice for keeping the fire model computationally feasible while maintaining high accuracy without quantifying each uncertainty [5]. Instead of following physics-based equations and uncertainties to predict a fire, a metamodel learns patterns from existing labeled data generated to replicate a real fire scenario. Fire metamodels have the potential to incorporate the firebrand phenomenon at a manageable computational cost. When properly trained and implemented, metamodels can provide high-accuracy fire modeling (including wildfire modeling) in real time.
Recently, metamodels based on statistical approaches and machine learning (ML) algorithms have been applied to fire modeling [10][11][12][13][14][15], including an ML decision tree-based model developed to classify the final fire size with an accuracy of 50.4% (±5.2%). Vapor pressure deficit (VPD) and spruce function were used as the significant independent variables in the prediction model in [16] for the final fire size, resulting in a misclassification rate of 49.6%. Another study on ML prediction of forest fire spread used topography, fuel characteristics, road networks, and fire suppression efforts as the significant independent variables to predict the final wildland fire perimeter [10]. This model was based on boosted logistic regression and had an accuracy of 69%. The 31% misclassification rate is probably due to the variability in the significant independent variables considered in the prediction model. As we will demonstrate in our study, this misclassification rate can be reduced by using proper parameters as significant prediction variables.
The validity of ML models relative to traditional algebraic models was previously demonstrated by research comparing different fire hazard models with non-linear ML models [5,11,15,17-19]. The accuracy of the ML prediction model was within 10% of the experimental value. Thus, the ML prediction model is considered as good as the traditional algebraic models [i.e., consolidated model of fire and smoke transport (CFAST) and fire dynamics simulator (FDS)] used in the study [5]. The efficiency of the non-linear ML model was validated in 25 different fire hazard simulations generated using CFAST. The non-linear ML model, K-nearest neighbor (KNN), gave the best results, with an error within 10% of the correct temperature value. The KNN model was developed to replicate CFAST modeling for faster prediction of nuclear power plant fires, and a boosted logistic regression based model was developed for classifying the final fire perimeter locations of wildland fires [11,15].
The goal of this study is to develop an ML metamodeling framework for firebrand production prediction. When applied to firebrand production, the ML metamodeling framework can provide a better understanding of the statistical distribution of firebrand properties (e.g., firebrand size, mass, and flying distance) and the accumulation of firebrands in areas of interest. This can aid in developing proper policies, codes, standards, strategies, and guidelines that address the firebrand spotting problem in fires (including wildfires and WUI fires). This work can also help further develop numerical firebrand generators that can be used in applications that assist fire modeling, fire risk assessment, and fire experiments (especially experiments involving firebrands). For example, in firebrand experiments, the framework and the numerical firebrand generator can assist in experimental designs that maximize their usefulness for metamodeling while minimizing cost and workload. Such design decisions include selecting proper fuel types (structural and vegetative) and testing environmental conditions (wind, humidity, temperature), determining requirements for data collection (e.g., in 1D, 2D, or 3D) and measurement (such as the minimum number of data sets, data accuracy, and precision), and estimating the limitations of the experimental data.
This metamodeling framework was applied to predicting the firebrand areal mass density (FAMD) and firebrand areal number density (FAND) of landing firebrands using ML models and firebrand data generated from laboratory firebrand production experiments. In brief, the ML models are trained with available firebrand data to recognize, extract, and learn the patterns in the data, and the learned patterns are then used to predict the FAND and FAMD of landing firebrands for unknown data. In the following sections, the experimental data and the exploratory analysis of the data are presented first. The ML metamodeling framework is then introduced. Results and findings from this metamodeling effort are then presented and discussed.

Source of Data
Many experiments have been conducted to study the firebrand production process [12-15, 20, 21]. The datasets for ML algorithm training and testing (or validation) are from the firebrand production study in [12][13][14][15]. The number of firebrands collected in other firebrand production experiments using full-scale building components or assemblies varied between 50 and 500. Study [13] collected and measured a significantly larger firebrand data set than any existing one: the physical properties of 59,820 firebrands, including 24,149 from structural components (fences, re-entrant corners, and roofs), 26,422 from wall-roof mockup structural assemblies, and 9249 from five wildland vegetative fuels (chamise, saw palmetto, loblolly pine, Leyland cypress, and little bluestem grass). In addition, the experimental work was based on a statistics-based framework for sampling and measurement [12]. The firebrand production experiments were full-scale and were performed in a large-scale open-jet wind tunnel. Fluctuating wind speeds were used in the experiments, with nominal speeds of 5.36 m/s (idle), 11.17 m/s (medium), and 17.88 m/s (high). Three physical quantities of firebrands were measured and reported: traveling (or flying) distance, 2D projected area (or size), and mass. The units in the data are mm² for projected area, grams for mass, and meters for traveling distance. The independent parameters controlled in the experimental setup were fuel type (structural and vegetative) and wind speed. Readers should refer to the report [13] and data archives [14,15] for details related to the experiments.
The independent factors and their designated values in the ML algorithms are listed in Table 1. For simplicity, the two fence types (privacy and lattice) were put in the same variable group as construction materials and types such as cedar/plywood, cedar/OSB (oriented strand board), recycled rubber, non-FRT (flame retardant treated) shake, and FRT shake. The assigned numbers are for identification purposes only.

Firebrand Areal Number Density (FAND) and Firebrand Areal Mass Density (FAMD)
Firebrands are often generated in large numbers and can accumulate in a concentrated area. Here we introduce the parameters FAND and FAMD to address this issue. FAND is defined as the number of landed firebrands per unit area. It is a measure of the concentration of landed firebrands. If the number of landed firebrands (by default all firebrands, both burning and extinguished) is $N$ and the area over which they landed is $A$, then FAND ($\rho_N$) is
$$\rho_N = \frac{N}{A} \qquad (1)$$
The higher the FAND, the more firebrands have accumulated in the area. FAND is an important parameter in assessing ignition potential and threat from firebrands. In modeling, FAND can help us understand spotting ignition more accurately and thus help forecast the spread of fire using an appropriate metamodeling approach. FAMD is defined as the mass of landed firebrands per unit area. FAMD denotes the surface density (area density) of the landed firebrands (both burning and extinguished). If $m$ denotes the mass of the landed firebrands, then FAMD ($\rho_m$) is
$$\rho_m = \frac{m}{A} \qquad (2)$$
The quantification of the area of interest in Eqs. 1 and 2 depends on the purpose of the study. For example, in a firebrand production laboratory study, the area can be the pre-defined area where firebrand collection takes place. In a firebrand ignition field study, the area can be the area of a structure (e.g., a house excluding yards), the area of a plot of land, or the area of a built neighborhood. In a study, FAND (or FAMD) can be global or local. In a firebrand production laboratory study, global FAND (or FAMD) is the total number (or mass) of collected firebrands divided by the total firebrand collection area. The local FAND (or FAMD) can be the number (or mass) of collected firebrands in a particular firebrand collection pan divided by the pan's area. Local and global FANDs or FAMDs can be used for different purposes. The area in Eqs. 1 and 2 is different from the projected area used in previous studies [12][13][14][15] to represent the size of a firebrand. The projected area, discussed in the following section, is the two-dimensional (2D) projection of a firebrand.
In application, threshold values of FAMD and FAND can be introduced to indicate the possibility of spotting ignition or to classify the risk levels from firebrands based on their FAMD and FAND values. Since FAMD and FAND are new concepts, no threshold values for these purposes have been defined in the literature yet. Thus, in our current study, the median values of the FAND and FAMD generated from the vegetative and structural fuels in the study in [13] are used as the threshold values to classify high-risk and low-risk FAND and FAMD and to demonstrate our framework and the process. FAMD and FAND values below the median values are classified as low risk or "low FAMD and low FAND", and FAMD and FAND values higher than the median values are classified as high risk or "high FAMD and high FAND". The process will be the same as long as the threshold values of low and high are determined. Further research on the threshold values of FAND and FAMD that cause spot fire ignition for different types of recipient fuels will help further refine the ML model developed in this study.
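As an illustration of the definitions above, the following minimal Python sketch computes local and global FAND and FAMD for a few collection pans and labels each pan as high or low using the median as the threshold. The pan areas and firebrand masses are made-up numbers, the names `areal_densities` and `label_by_median` are hypothetical, and treating a value exactly equal to the median as "low" is an assumption, since the text only specifies values below and above the median.

```python
from statistics import median

def areal_densities(pans):
    """Return a list of (FAND, FAMD) tuples, one per collection pan.

    FAND = number of firebrands / pan area, FAMD = total mass / pan area.
    """
    result = []
    for pan in pans:
        area = pan["area_m2"]
        masses = pan["masses_g"]
        result.append((len(masses) / area, sum(masses) / area))
    return result

def label_by_median(values):
    """Label each value 'high' if it exceeds the median, otherwise 'low'.

    Assumption: a value equal to the median is labeled 'low'.
    """
    threshold = median(values)
    return ["high" if v > threshold else "low" for v in values]

# Hypothetical pans (illustrative numbers, not experimental data).
pans = [
    {"area_m2": 0.5, "masses_g": [0.10, 0.20, 0.30]},
    {"area_m2": 0.5, "masses_g": [0.05]},
    {"area_m2": 1.0, "masses_g": [0.40, 0.10, 0.20, 0.10, 0.20, 0.20, 0.10, 0.10]},
]

densities = areal_densities(pans)
fand = [d[0] for d in densities]   # local FAND, firebrands per m^2
famd = [d[1] for d in densities]   # local FAMD, grams per m^2
fand_labels = label_by_median(fand)
famd_labels = label_by_median(famd)

# Global FAND: total number of firebrands over the total collection area.
total_area = sum(p["area_m2"] for p in pans)
global_fand = sum(len(p["masses_g"]) for p in pans) / total_area

print(fand, fand_labels, global_fand)
```

Any other threshold (for example, one tied to the ignition of a specific recipient fuel) could replace the median without changing the rest of the process.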
Since the input variables wind speed and travel distance were not continuous, this current study focuses on predicting discrete values of FAND and FAMD (e.g., high and low values). In theory, full-scale wind tunnel experiments measuring firebrand production with continuous wind speed and travel distance are required to predict continuous FAMD and FAND values. In practice, it is unrealistic or impossible to have continuous wind speed and travel distance in firebrand production experiments. However, higher resolutions of wind speed and travel distance in experimental design will improve ML algorithms.

Relationship Between Firebrand Size (Projected Area) and Mass
Firebrand projected area (i.e., size) and mass are highly correlated. The relationship between firebrand mass ($m$) and projected area ($a$) can be expressed as
$$a = \eta\, m^{\alpha} \qquad (3)$$
The value of the coefficient $\alpha$ is generally between 0.5 and 1.0. It is close to 2/3 for firebrands produced from vegetative fuels [22], as these firebrands tend to be 1D. Firebrands from structural fuels are more likely to be 2D [13]. To better understand the correlation between projected area and mass in our data, a log transformation of the variables (area and mass) was performed. The results are shown in Tables 2 and 3. For structural firebrands, the coefficient $\alpha$ varies from 0.63 to 0.77, with an average value of 0.70. For vegetative firebrands, $\alpha$ varies from 0.59 to 0.91, with an average value of 0.72. The log-log plot of firebrand projected area and mass has a linear trend for every combination of factors, as shown in Figs. 1 and 2; thus,
$$\log m = \beta \log a + c \qquad (4)$$
where $c$ is the intercept on the log-log plot, and the coefficient $\beta$ is the slope of the line and the inverse of the coefficient $\alpha$ in Eq. 3, i.e.,
$$\beta = \frac{1}{\alpha} \qquad (5)$$
Thus, Eq. 4 can be rewritten as
$$\log m = \frac{1}{\alpha} \log a + c \qquad (6)$$
$$m = \left(\frac{a}{\eta}\right)^{1/\alpha} \qquad (7)$$
where $c = -(\log \eta)/\alpha$ follows from Eq. 3. Equation 7 is important and useful in practice, especially in firebrand experiments or post-fire investigations. Advances in digital photography and computer vision have made it possible to capture images of firebrands in flight (even in real time) and after landing. Using appropriate algorithms [12][13][14][15], one can obtain the projected area of individual firebrands in these images with a high level of accuracy. In contrast, measuring firebrand mass is a labor-intensive process. One must weigh each firebrand individually to get its mass. Furthermore, since firebrands are fragile, handling a firebrand during weighing may break it and lead to errors in measurement and uncertainties in the whole set of firebrand data. With the projected area determined from digital imaging analysis or computer vision, Eq. 7 can then be used to calculate the mass of the corresponding firebrand. The accuracy of the calculation is mainly affected by the factors $\eta$ and $\alpha$ (or $\beta$). More accurate values of $\eta$ and $\alpha$ (or $\beta$) will help fine-tune the ML model and allow firebrand mass to be predicted with higher accuracy.
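The log-log relationship between projected area and mass can be illustrated with a short least-squares sketch. It assumes the power-law form $a = \eta m^{\alpha}$ and base-10 logarithms (the text does not state the base), and it uses synthetic data generated from illustrative values of $\alpha$ and $\eta$ rather than the experimental measurements. The function names are hypothetical.

```python
import math

def fit_loglog(areas, masses):
    """Least-squares fit of log10(m) = beta * log10(a) + c; returns (beta, c)."""
    x = [math.log10(a) for a in areas]
    y = [math.log10(m) for m in masses]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    beta = sxy / sxx
    c = mean_y - beta * mean_x
    return beta, c

def mass_from_area(area, beta, c):
    """Estimate firebrand mass from projected area using the fitted line."""
    return 10 ** (c + beta * math.log10(area))

# Synthetic data from a = eta * m**alpha (illustrative values only).
alpha_true, eta_true = 0.7, 900.0
masses = [0.01, 0.05, 0.1, 0.5, 1.0]                   # grams
areas = [eta_true * m ** alpha_true for m in masses]   # mm^2

beta, c = fit_loglog(areas, masses)
alpha_est = 1.0 / beta           # slope is the inverse of alpha
eta_est = 10 ** (-c / beta)      # from c = -(log10 eta) / alpha

print(alpha_est, eta_est, mass_from_area(areas[2], beta, c))
```

With real measurements the points scatter around the fitted line, so the recovered coefficients carry uncertainty that propagates into any mass estimated from an image-derived area.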

Independent Variables
The purpose of the ML models is to predict FAND and FAMD in order to estimate the chances of spot ignition, since high FAND and FAMD values are associated with a high probability of spot ignition in a wildfire or ignition of structures [23,24]. In a WUI fire, the travel distance and wind speed can be measured and the fuel type of the firebrand source can be identified, which enables the prediction of FAND and FAMD in real time. We will train the ML models with the independent variables and compare the accuracies after determining the importance of the predicting variables. Firebrand mass and projected area are dropped from the independent variables for two reasons. The first is related to ML algorithms: the high correlation between projected area and mass can cause high instability in the ML models, because adjusting the coefficient for one predicting variable affects the coefficient for the correlated predicting variable. This implies that the predicting variables need to be independent of one another. The second reason is that firebrand mass and projected area are values measured after firebrands are generated. With the removal of mass and projected area, the remaining independent predicting variables are easy to obtain in a wildland or WUI fire and will not cause any variance inflation. The feature reduction allows us to use the minimum possible information while maintaining high model accuracy. The independent variables used in our ML models to predict the dependent variables FAND and FAMD are therefore landing (or travel) distance, wind speed, and fuel type (structural and vegetative fuels).

Hypothesis
In addition to analyzing the impact of different fuels on spot ignition in a WUI fire, this research will analyze the following two hypotheses: (1) Spot ignition due to firebrands is more likely at high wind speed than at low wind speed. This is because the wind aids combustion by supplying oxygen, which can generate more firebrands and increase the risk of spot ignition due to landing firebrands. (2) Among the three vegetation types (given the same weight), trees create higher FAND and FAMD than the other vegetation types.

Finding Linear and Non-linear Patterns in Data Separation
The data used to predict FAND and FAMD have five dimensions, which are difficult to visualize and interpret directly. To overcome this problem, we used a scatter plot matrix, which visualizes the data separation in 2D for each pair of independent variables and helps decide whether the binary data can be separated with a linear or a non-linear ML model. The histograms (see Figs. 9, 10, 11) give some interesting insights that were not detected in the scatter plot matrix. While the scatter plot matrix shows that the independent variables cannot separate high and low FAND and FAMD, the histograms confirm that the data points overlap and thus that a linear separation is not possible. Therefore, non-linear ML models are needed for the prediction. Due to the complexity of fire dynamics, a non-parametric ML model is expected to achieve higher accuracy than a parametric model, since non-parametric ML models do not make strong assumptions about the data; instead, they create their own function and learn the patterns in the data structure. Thus, the prediction function of a non-parametric ML model remains open to modification when new data points are added in the future.

Machine Learning Framework
The ML framework used in the prediction process is illustrated in Fig. 3. The process starts with the randomization of available data. Since collinearity among the input variables was resolved and unimportant independent variables were eliminated in the exploratory data analysis, we focus on developing non-linear non-parametric ML models. We chose to develop a supervised non-linear non-parametric ML model because the dynamics of fire behavior are non-linear and subject to various uncertainties such as temperature, humidity, and weather. An ML model is selected and prepared with hyperparameters for training. The data is split randomly into train and test splits. The model is trained on the train split, and the test split is then fed into the trained model to evaluate its performance. The hyperparameters are then varied to check whether the evaluated accuracy improves, and the final tuned model is validated with the test split. Finally, the ML model is trained with a reduced sample size of input data and the change in accuracy is monitored. This process is repeated with another ML model and the model performance is compared. The process is explained in detail in the following subsections.
Figure 3. Illustration of the ML metamodeling framework.

Non-linear Non-parametric Machine Learning Algorithms
Considering the abundant data available [12][13][14][15] to predict the firebrands, and to avoid any strong assumptions about the form of the mapping function, the non-parametric ML model approach is well suited to the prediction modeling. Parametric ML models make strong linear or non-linear assumptions, which work best on data with a detectable linear or non-linear separation between the dependent variable labels. The data in this research do not have such patterns, and thus any parametric ML model would either overfit or underfit the data depending on the hyperparameters used to train the model. Non-parametric models, in contrast, make only light assumptions of their own, which remain flexible when data points are added or removed in the future. These light assumptions are made after learning the patterns in the data with a tuned hyperparameter to avoid overfitting.
To further confirm the choice of a non-parametric non-linear ML model over other available ML models, we predicted firebrand data with parametric and non-parametric ML models and compared their accuracies. The differences in model accuracies are shown in Fig. 12. We chose the two ML models with the highest accuracy, tuned their hyperparameters to further increase their accuracy, and proceeded with one of the two best prediction models. Among the available methods for this study, the non-parametric methods used in the ML prediction of firebrand data are KNN and the non-linear support vector machine (SVM non-linear), as described later in Sects. 3.5 and 3.6.
The data is divided randomly into training and testing splits. The training split is used for training the ML model, and the testing split is used to validate the prediction efficiency of the ML model after it is trained.

Training and Testing Datasets
The purpose of splitting the dataset into testing and training datasets is to generate a pseudo-unknown dataset out of the known dataset, which is split randomly with the help of an algorithm. The training and testing datasets have an equal number of data points. The training dataset has the independent predicting variables as: (1) Distance denotes the travelling distance bins of the landing firebrand generated in the experiments. A larger training dataset makes the prediction model better and more accurate, while more test data provide a better error estimate. This tradeoff between model accuracy and error estimation can be resolved by splitting the data in half (50% training and 50% test data). Another way to get a better error estimate is to perform cross validation. Thus, to increase the model accuracy, we chose a training split of 70% and reduced the test split to 30%.
All the predicting variables are independent of each other and are convenient to obtain in a WUI fire. The training data (70% of the total data points) is used to train the ML algorithm. The trained ML algorithm is then used to perform prediction on the testing data (30% of the total data points). The test data, created by the random split of the dataset, is labeled. Thus, the ML predictions on the test data and the actual values for the landing firebrands can be compared and used to create a confusion matrix (accuracy evaluation matrix). This confusion matrix, shown in Fig. 4, helps evaluate the accuracy of the ML model on pseudo-unknown data. In addition to the accuracy of the current ML model, the confusion matrix obtained on the testing data provides various other insights into the model's prediction efficiency on an unknown dataset.
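The split-and-evaluate step can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in the study: the records are made-up, the function names are hypothetical, and a simple distance-threshold rule stands in for the trained ML model so that the confusion-matrix bookkeeping can be shown on its own.

```python
import random

def train_test_split(rows, train_fraction=0.7, seed=0):
    """Shuffle a copy of the rows and split it into (train, test)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]

def confusion_counts(actual, predicted, positive="HF"):
    """Count TP, TN, FP, and FN, treating `positive` (high FAND/FAMD) as the positive class."""
    tp = tn = fp = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive:
            if a == positive:
                tp += 1
            else:
                fp += 1
        else:
            if a == positive:
                fn += 1
            else:
                tn += 1
    return {"TP": tp, "TN": tn, "FP": fp, "FN": fn}

# Hypothetical labeled records: (distance bin, wind speed in m/s, label).
rows = [(d, w, "HF" if d <= 3 else "LF")
        for d in range(1, 11)
        for w in (5.36, 11.17, 17.88)]

train, test = train_test_split(rows)            # 70% / 30%

# Stand-in predictor (an assumption, not the trained ML model).
predicted = ["HF" if d <= 4 else "LF" for d, _, _ in test]
actual = [label for _, _, label in test]
counts = confusion_counts(actual, predicted)
print(len(train), len(test), counts)
```

Fixing the seed makes the random split reproducible, which is convenient when comparing different ML models on the same test split.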

Model's Performance
The confusion matrix obtained by applying the ML model to the test data is used to measure the accuracy of the ML model (Fig. 4). The predicted high FAND or FAMD is denoted by High FAND or High FAMD (simplified as HF), and the predicted low FAND or FAMD is denoted by Low FAND or Low FAMD (simplified as LF); these are the two classes of the target variable in the data. For the performance measurement of the model, we refer to the confusion matrix (Fig. 4). The value TP represents the total HF correctly predicted, TN represents the total LF correctly predicted, FP represents the total LF predicted as HF, and FN represents the total HF predicted as LF. FP and FN are the incorrect predictions, also known as misclassifications. Each of the four values in the confusion matrix lies between 0 and the total number of data points evaluated. The best-case scenario for the prediction model is for both FP and FN to be 0 on the train data and the test data, which indicates that the ML model predictions were 100% correct. The TP and TN values should be as high as possible, with a maximum equal to the number of data points in the corresponding class of the target variable. The four values in the confusion matrix are never negative.
The other performance parameters are defined in the following. Except for MCC, their values range from 0 to 100%; they are listed in Tables 4 and 5.
Accuracy: the ratio of correct predictions to the total predictions performed by the model.
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \qquad (8)$$
Precision: the ratio of correct positive class predictions to the total positive class predictions.
$$\text{Precision} = \frac{TP}{TP + FP} \qquad (9)$$
Sensitivity (TPR): sensitivity, also known as recall and true positive rate (TPR), is the ratio of the correct positive values predicted to the total number of positive values in the data, for example, the fraction of total HF correctly classified as HF.
$$\text{TPR} = \frac{TP}{TP + FN} \qquad (10)$$
False Positive Rate (FPR): the ratio of negative class values predicted as positive class to the total negative class values.
$$\text{FPR} = \frac{FP}{FP + TN} \qquad (11)$$
Specificity (SP): specificity, also known as true negative rate (TNR), is the ratio of correct negative values predicted to the total negative values in the data. For this research, SP is the ratio of correct LF predicted to the total LF values.
$$\text{SP} = \frac{TN}{TN + FP} \qquad (12)$$
Prevalence: prevalence shows how often the positive value (HF in our case) occurs in the sample data.
$$\text{Prevalence} = \frac{TP + FN}{TP + TN + FP + FN} \qquad (13)$$
Positive Predictive Value (PPV): also known as precision, shows how often the predicted positive value is correct.
$$\text{PPV} = \frac{TP}{TP + FP} \qquad (14)$$
Negative Predictive Value (NPV): the ratio of actual TN values to the total negative values predicted.
$$\text{NPV} = \frac{TN}{TN + FN} \qquad (15)$$
Balanced Accuracy: the average of the recall calculated in each class.
$$\text{Balanced Accuracy} = \frac{\text{TPR} + \text{SP}}{2} \qquad (16)$$
F1 Score: the harmonic mean of sensitivity and precision. Since the F1 score accounts for both false positive and false negative values, it is helpful when the class distribution is uneven.
$$F_1 = \frac{2 \cdot \text{Precision} \cdot \text{TPR}}{\text{Precision} + \text{TPR}} \qquad (17)$$
Matthews Correlation Coefficient (MCC): MCC produces a score from all of the false positive, false negative, true positive, and true negative values. MCC is also known as a balanced measure. The value of MCC varies from −1 to +1, where +1 is a perfect prediction, 0 a random prediction, and −1 a completely wrong prediction.
$$\text{MCC} = \frac{TP \cdot TN - FP \cdot FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}} \qquad (18)$$
Confidence Interval (CI): shows the uncertainty of an estimate. There is a 95% chance that the accuracy of the prediction model lies between the upper and lower limits of the interval. For a sample size of $n$ and $z$ as the number of standard deviations from the Gaussian distribution, the CI is calculated as follows, where $z$ is 1.96 for a 95% CI:
$$\text{CI} = \text{Accuracy} \pm z \sqrt{\frac{\text{Accuracy}\,(1 - \text{Accuracy})}{n}} \qquad (19)$$
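The performance parameters defined above can all be computed from the four confusion-matrix counts. The following Python sketch uses the standard textbook definitions of these metrics; the function name `classification_metrics` and the example counts are hypothetical, and the sketch assumes that no denominator is zero (i.e., both classes occur and both are predicted at least once).

```python
import math

def classification_metrics(tp, tn, fp, fn, z=1.96):
    """Compute common binary-classification metrics from confusion-matrix counts.

    Values are returned as fractions (multiply by 100 for percentages).
    """
    n = tp + tn + fp + fn
    accuracy = (tp + tn) / n
    precision = tp / (tp + fp)            # same as positive predictive value
    sensitivity = tp / (tp + fn)          # recall, true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    half_width = z * math.sqrt(accuracy * (1 - accuracy) / n)
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "fpr": fp / (fp + tn),
        "specificity": specificity,
        "prevalence": (tp + fn) / n,
        "npv": tn / (tn + fn),
        "balanced_accuracy": (sensitivity + specificity) / 2,
        "f1": 2 * precision * sensitivity / (precision + sensitivity),
        "mcc": (tp * tn - fp * fn) / mcc_denominator,
        "ci": (accuracy - half_width, accuracy + half_width),
    }

# Hypothetical counts for 100 test points (HF is the positive class).
metrics = classification_metrics(tp=45, tn=40, fp=10, fn=5)
for name, value in metrics.items():
    print(name, value)
```

Note that the normal-approximation interval used for the CI becomes unreliable for very small test sets or accuracies close to 0 or 1.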

Model Tuning
The confusion matrix helps us identify the efficiency of the ML model. If the desired false negative and false positive values (present in the confusion matrix) are achieved, hyperparameter tuning of the same ML model is conducted. Hyperparameter tuning of an ML model allows us to choose the best values of the model parameters used in the ML algorithm to control the learning rate and prevent underfitting or overfitting, which can lead to a drop in the accuracy of the model. If the false positive and false negative values are higher than desired, a different ML model is chosen and trained with the available training data. For the KNN ML model, the hyperparameter is the K-value, which represents the number of nearest neighbors used to classify a data point. In Fig. 5, the optimum value of K is identified where the error rate from the training data (solid line in Fig. 5) gets close to the error rate from the test data (dashed line in Fig. 5). In Plot a of Fig. 5, the train error and test error are close at a K-value of 5. Thus, the hyperparameter K-value is 5 for the ML model predicting FAMD generated from the structural fuels. The other ML models are tuned similarly for their K-values in Plots b, c, and d.
After successful hyperparameter tuning, the sample size of the model is reduced while maintaining a high F1 score of the prediction model. We analyzed the drop in error with respect to the increase in the training data's sample size and concluded that approximately 250 data points (vertical dashed line in Fig. 6) are required to train the ML model without compromising the model accuracy. Figure 6 shows that the ML models have an error rate below 10% (horizontal dashed line) at 250 data points and above. This can enable future researchers to develop an efficient ML model with a high level of accuracy without spending time, money, and effort on generating a far larger dataset than is required to predict the FAND and FAMD of landing firebrands.

K-Nearest Neighbor (KNN)
The KNN ML model uses the Euclidean distance function to find the K nearest neighbors. The prediction can be performed with only four independent variables: distance, wind speed, component, and material for structural fuels; and distance, wind speed, species, and vegetation for vegetative fuels. This increases the efficiency of the model and requires less data for prediction. The model uses labeled points to label a new point: the label is decided by the majority of the surrounding neighbors, as measured by Euclidean distance. If $p$ and $q$ are two points with $n$ features each, then the distance between $p$ and $q$ is calculated as
$$d(p, q) = \sqrt{\sum_{i=1}^{n} (q_i - p_i)^2} \qquad (20)$$
Figure 5. Hyperparameter (K-value) tuning for the KNN ML models: Plot a represents hyperparameter tuning of the ML model predicting FAMD produced from structural fuels, Plot b represents hyperparameter tuning of the ML model predicting FAMD produced from vegetative fuels, Plot c represents hyperparameter tuning of the ML model predicting FAND produced from structural fuels, and Plot d represents hyperparameter tuning of the ML model predicting FAND produced from vegetative fuels.
After training the models, accuracies on the testing data are derived from the confusion matrices (Table 6a, b, c, and d).
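To make the KNN procedure concrete, the following Python sketch implements majority voting among the K nearest neighbors under Euclidean distance and then compares the train and test error rates for several K-values, in the spirit of the tuning shown in Fig. 5. It is only an illustration: the feature values are made-up and assumed to be already scaled, only two numeric features are used (the categorical fuel variables are omitted), and the function names are hypothetical.

```python
import math
from collections import Counter

def euclidean(p, q):
    """Euclidean distance between two points with the same number of features."""
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))

def knn_predict(train_X, train_y, point, k):
    """Label `point` by majority vote among its k nearest training points."""
    order = sorted(range(len(train_X)),
                   key=lambda i: euclidean(train_X[i], point))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]

def error_rate(train_X, train_y, X, y, k):
    """Fraction of points in (X, y) that the k-NN rule misclassifies."""
    wrong = sum(knn_predict(train_X, train_y, x, k) != label
                for x, label in zip(X, y))
    return wrong / len(X)

# Hypothetical scaled features: (travel distance, wind speed).
train_X = [(0.10, 0.20), (0.20, 0.10), (0.15, 0.30),
           (0.80, 0.90), (0.90, 0.80), (0.85, 0.70)]
train_y = ["LF", "LF", "LF", "HF", "HF", "HF"]
test_X = [(0.20, 0.20), (0.90, 0.90)]
test_y = ["LF", "HF"]

# Odd K-values avoid tied votes when there are two classes.
for k in (1, 3, 5):
    train_err = error_rate(train_X, train_y, train_X, train_y, k)
    test_err = error_rate(train_X, train_y, test_X, test_y, k)
    print(k, train_err, test_err)
```

Because Euclidean distance is sensitive to the units of each feature, scaling the features to comparable ranges before computing distances is a common precaution.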

Support Vector Machine Radial Kernel
SVM requires little tuning while producing high performance [25]. A radial basis function (RBF) kernel was used for the SVM ML model because the data showed non-linear patterns. The Gaussian RBF between a point x_i and a center x_j is calculated as,

$$\phi(x_i, \mathrm{center}) = \exp\left(-\gamma \lVert x_i - x_j \rVert^2\right) \qquad (21)$$

where $\lVert x_i - x_j \rVert^2$ is the squared Euclidean distance and γ in Eq. 21 is,

$$\gamma = \frac{1}{2\sigma^2}$$

Figure 6. Sample size reduction for the optimum training size: FAND_Str represents the ML model predicting FAND produced from structural fuels, FAMD_Str represents the ML model predicting FAMD produced from structural fuels, FAND_Veg represents the ML model predicting FAND produced from vegetative fuels, and FAMD_Veg represents the ML model predicting FAMD produced from vegetative fuels. The red dashed line shows 10% error, the black dashed line shows 5% error, and the yellow dashed line shows 3% error (Color figure online).

The hyperparameter γ decides the impact of the feature φ(x_i, center) on the hyperplane. As the value of γ increases, the boundary gets curvier. Thus, an extremely small value makes the model behave like a linear model (underfit), and a large value of γ makes the model overfit. After hyperparameter tuning of the ML models predicting FAND and FAMD, the optimum value of γ ranges from 0.04 to 0.23 to achieve the best possible F1 score (the model's accuracy). Another hyperparameter tuned for the SVM model is COST (ζ), which controls the tolerance for misclassifications allowed through the hyperplane. The degree of misclassification can be explained as the softness of the hyperplane (n-dimensional margin) separating the dependent variables. As ζ increases, the separation boundary gets stricter and penalizes misclassifications more, which means the decision boundary depends on fewer support vectors. However, if ζ is small, the separation boundary depends on more support vectors because the boundary gets softer (underfit), and for a ζ value of 0 the data will not be separated at all. After hyperparameter tuning of the ML models, the optimum value of ζ ranges from 4.1 to 8.0. The tuned hyperparameters, ζ and γ, generate an F1 score ranging from 56.60% to 85.95% (Table 7a, b, c, and d).

In the confusion matrices, ALF represents Actual Low FAMD/FAND, AHF represents Actual High FAMD/FAND, PLF represents Predicted Low FAMD/FAND, and PHF represents Predicted High FAMD/FAND generated from structural/vegetative fuels.
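
To make the role of the kernel-width hyperparameter concrete, the following minimal sketch evaluates a Gaussian RBF in its common form, exp(-gamma * ||x_i - x_j||^2), at the two ends of the tuned range quoted above (0.04 and 0.23). The function name `rbf_kernel` and the two sample points are hypothetical; this illustrates the kernel only and is not the SVM training procedure used in the study.

```python
import math

def rbf_kernel(x_i, x_j, gamma):
    """Gaussian RBF similarity: exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x_i, x_j))
    return math.exp(-gamma * sq_dist)

# Two made-up feature vectors; their squared distance is 1 + 4 = 5.
x_i, x_j = (1.0, 2.0), (2.0, 4.0)

for gamma in (0.04, 0.23):
    # A larger gamma makes similarity decay faster with distance,
    # which is what lets the decision boundary bend more sharply.
    print(gamma, rbf_kernel(x_i, x_j, gamma))
```

A point always has similarity 1 with itself, and for any fixed pair of distinct points the similarity shrinks as gamma grows, so each training point influences a smaller neighborhood.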
Of the two top non-linear non-parametric ML models that we used, we chose KNN for its slightly higher accuracy compared to the SVM ML model (Fig. 12). The KNN ML model achieved an accuracy greater than 90% when predicting FAND and FAMD generated from structural and vegetative fuels, while SVM gave an accuracy below 90%.
The approximately 7% misclassification in the KNN model may be due to independent variables that were not included in training the ML model. The model's prediction accuracy reflects the use of only those independent variables that can be physically obtained within minutes in a real WUI fire scenario, which gives incident command more time to react and make better decisions. Another reason for the model's misclassification may be the limited full-scale experimental data used to train the ML model. Conducting full-scale experiments is expensive and requires time and hard work. To reduce misclassification, the model can be trained with a larger amount of full-scale experimental data. The acceptable misclassification rate depends on the purpose of the modeling. The prediction model can be used for planning an experiment, wildland fire spread modeling, fire risk assessment involving firebrands, and so on. Depending on a project's specific needs, users can set the misclassification rate limit. We chose three error limits (10%, 5%, and 3%) to demonstrate the process and framework, but users may choose their own limits.

Area Under the Receiver Operating Characteristic Curve
The purpose of comparing the area under the receiver operating characteristic curve (AUROC) of the two ML models, KNN and SVM-NL, is to compare their performance at various thresholds. The AUROC helps us understand each model's measure of separability. The receiver operating characteristic (ROC) curve is plotted with the false positive rate (FPR) on the x-axis and the true positive rate (TPR) on the y-axis at different threshold values. The AUROC is then calculated using the composite trapezoidal rule along the x-axis.
The KNN model shows a higher AUROC value than the SVM-NL model. The AUROC ranges from 98.18% to 99.65% for KNN and from 64.85% to 98.71% for SVM-NL. This shows that the measure of separability of the KNN model is higher than that of SVM-NL for predicting FAMD/FAND.
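
The ROC construction and the composite trapezoidal rule can be sketched in a few lines. The helper names `roc_points` and `auroc`, the labels (1 for high, 0 for low), and the scores are hypothetical; for a KNN classifier, one plausible score is the fraction of the K neighbors that vote "high", but that choice is an assumption and is not stated in the text.

```python
def roc_points(y_true, scores):
    """(FPR, TPR) points from sweeping the threshold over the observed scores.

    Assumes y_true contains at least one positive (1) and one negative (0).
    """
    pos = sum(y_true)
    neg = len(y_true) - pos
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
        points.append((fp / neg, tp / pos))
    return points

def auroc(points):
    """Area under the ROC curve by the composite trapezoidal rule along x."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area

# Made-up labels and scores for six test samples.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

print(auroc(roc_points(y_true, scores)))
```

In this toy case, 8 of the 9 possible (high, low) pairs are ranked correctly, so the area is 8/9; a perfectly separating score would give an area of 1.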

Cross Validation
The KNN model's performance on the test data shows that we can expect the ML model to predict with at least 93.65% accuracy on an unknown dataset generated from a fire similar to the fire created in our test facility. For a completely unknown dataset generated in a different environment (a real WUI fire), the accuracy may be lower. To check the trained ML model's performance on such an unknown dataset, we performed K-fold cross validation on our existing dataset. In this study, the value of K was 10. Thus, the dataset was randomly shuffled and then split into 10 groups (K) of approximately equal size. One group was held out as a validation dataset and the other nine groups (K − 1) were used to train the ML model to obtain the model accuracy [26]. The process was repeated for each group in turn, and the average of all accuracies was used as the model accuracy.
The accuracies for 10-fold cross validation with the KNN model on our dataset range from 63.62% to 84.04% (Table 9), which denotes the prediction accuracy of the KNN model on an unknown dataset.
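
The shuffle, split, hold-out, and average steps of K-fold cross validation can be sketched as follows. Everything here is illustrative: the helper names, the small KNN classifier, and the synthetic, deliberately well-separated two-feature data are assumptions and do not reproduce the study's dataset or its reported accuracies.

```python
import random
from collections import Counter

def k_fold_indices(n_samples, n_folds=10, seed=0):
    """Shuffle sample indices and split them into n_folds near-equal groups."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::n_folds] for i in range(n_folds)]

def knn_predict(train, query, k=3):
    """Majority vote among the k training rows nearest to `query`."""
    ranked = sorted(train,
                    key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], query)))
    return Counter(label for _, label in ranked[:k]).most_common(1)[0][0]

def cross_validate(data, n_folds=10, k=3, seed=0):
    """Hold out each fold once, train on the rest, and average the accuracies."""
    accuracies = []
    for held_out in k_fold_indices(len(data), n_folds, seed):
        held = set(held_out)
        train = [row for i, row in enumerate(data) if i not in held]
        hits = sum(knn_predict(train, data[i][0], k) == data[i][1] for i in held_out)
        accuracies.append(hits / len(held_out))
    return sum(accuracies) / len(accuracies)

# Synthetic rows ((distance, wind level), label); the two classes do not overlap.
rng = random.Random(1)
data = [((rng.uniform(0, 4), rng.uniform(2, 3)), "high") for _ in range(50)]
data += [((rng.uniform(8, 12), rng.uniform(0, 1)), "low") for _ in range(50)]

print(cross_validate(data, n_folds=10, k=3))
```

Because the synthetic classes are fully separated, every fold is classified correctly here; real firebrand data overlap, which is why the cross-validated accuracies in Table 9 are lower than the single test-split accuracy.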

Results and Discussions
In the histograms (generated in the bivariate analysis of the impact of each independent variable on the target classes), we used the percent FAMD and FAND values generated in the experiments to uniformly compare and analyze the impact of the different independent variables on FAND and FAMD generation. In this research, FAND or FAMD is classified as high for values above the median and as low for values below the median. The histograms with distance bins comparing high and low FAMD and FAND show that there is a high chance of spot ignition at shorter distances for firebrands produced from both vegetative and structural fuels. Figure 8 shows one such histogram, with distance bins in meters. The histograms show that the chance of spot ignition decreases as the distance increases. However, both structural and vegetative fuels can generate high FAMD at distances up to 12 m (with a low probability of occurrence).
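
The median-based split into high and low classes can be written down directly. The sketch below is only an illustration: the function name `label_by_median` and the sample readings are hypothetical, and the handling of a value exactly equal to the median (labeled low here) is an assumption, since the text does not specify it.

```python
from statistics import median

def label_by_median(values):
    """Label each value 'high' if it exceeds the median of `values`, else 'low'."""
    threshold = median(values)
    # Assumption: a value exactly equal to the median is labeled 'low'.
    return ["high" if v > threshold else "low" for v in values]

# Hypothetical FAMD readings, not taken from the experiments.
famd = [0.2, 1.5, 0.7, 3.1, 0.4, 2.2]

print(label_by_median(famd))
```

Splitting at the median yields two classes of roughly equal size, which keeps accuracy and F1 score from being dominated by one class.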
The predictor variables analyzed in the data are wind, distance, and fuel characteristics. Wind speed has three categories: low, medium, and high. We hypothesized that there is a higher chance of high FAND and FAMD generation at high wind speed for both structural and vegetative fuels. In the bivariate analysis (Fig. 9), high wind gives a higher chance of high FAND and FAMD generation, except for the FAND generated from vegetative fuels, where medium wind speed has a higher proportion (1% higher) of high FAND generation than of low FAND generation (for a threshold at the median). Thus, we reject the hypothesis that there is a higher chance of spot ignition due to firebrands at high wind speed than at low wind speed. These anomalies could be due to certain fuel characteristics, which need further study. The vegetative fuels have three vegetation types. The bivariate analysis of the impact of vegetation type on FAND and FAMD generation (Fig. 11) shows that trees tend to generate a higher proportion of high FAMD than shrubs (18.4% higher), but shrubs tend to generate a higher proportion of high FAND than trees (0.4% higher). Thus, we reject the hypothesis that, among the three vegetation types (given the same weight), trees create higher FAND and FAMD than the other vegetation types.
Specific fuel types show a high probability of high FAND and FAMD generation, leading to a higher risk of spot fire ignition. The construction materials Non-FRT Shake (Material 7, Fig. 10d) and Recycled Rubber (Material 6, Fig. 10c), which are used to construct roofs (structural component 3 in Fig. 10a, b), have a higher chance of fire propagation through spot-fire ignition. Similarly, the vegetative species Leyland Cypress (tree) and Saw Palmetto (shrub) have a high potential for fire spread through spot ignition (Fig. 11a, b). The structural component with the highest risk of firebrands causing spot fire ignition is the roof (Component 3 in Fig. 10a, b). The ratio of spot ignition caused by the roof is 19.5% higher than that of the corner and 45.5% higher than that of the fence.
Conducting full-scale experiments in a wind tunnel to generate firebrands is expensive, time consuming, and labor intensive. The prediction model should therefore use the minimum amount of input data that does not cause underfitting. Thus, the minimum sample size required to predict the areal density of firebrands is determined by training the relevant non-linear non-parametric ML model (KNN) on a gradually reduced training dataset while monitoring the rate of change of the error.
The drop in accuracy of the non-linear ML models as the training data are gradually reduced is monitored through the change in error rate, as shown in Fig. 6. The graph shows that the KNN model can predict FAND and FAMD with an error rate below 10% with a minimum training size of 250 data points. As the number of training data points increases above 250, the error remains approximately the same.
Thus, an optimum sample size of 250 for each test setup is suggested for training an ML model predicting FAND or FAMD. In future experiments, researchers may use 250 as a guide when determining the size of the training data. This result can lead to cost savings and time reduction in future experiments.
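
The sample size reduction procedure can be sketched as a learning curve: train on progressively larger subsets, record the test error, and pick the smallest size from which the error stays within a chosen limit. This is a sketch under stated assumptions, not the study's procedure: the helper names, the small KNN classifier, the synthetic data, and the size grid are all hypothetical, and the numbers it prints are not experimental results.

```python
import random
from collections import Counter

def knn_predict(train, query, k=3):
    """Majority vote among the k training rows nearest to `query`."""
    ranked = sorted(train,
                    key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], query)))
    return Counter(label for _, label in ranked[:k]).most_common(1)[0][0]

def error_curve(train, test, sizes, k=3, seed=0):
    """Test error rate after training on the first n shuffled rows, per n in sizes."""
    shuffled = list(train)
    random.Random(seed).shuffle(shuffled)
    curve = []
    for n in sizes:
        subset = shuffled[:n]
        misses = sum(knn_predict(subset, x, k) != y for x, y in test)
        curve.append((n, misses / len(test)))
    return curve

def smallest_adequate_size(curve, limit):
    """Smallest size such that it and every larger size have error <= limit."""
    best = None
    for n, err in sorted(curve, reverse=True):
        if err > limit:
            break
        best = n
    return best

def make_rows(n, rng):
    """Synthetic rows ((distance, wind), label); label depends on distance only."""
    rows = []
    for _ in range(n):
        distance, wind = rng.uniform(0, 12), rng.uniform(0, 3)
        rows.append(((distance, wind), "high" if distance < 6 else "low"))
    return rows

rng = random.Random(42)
train, test = make_rows(400, rng), make_rows(100, rng)
curve = error_curve(train, test, sizes=[10, 25, 50, 100, 200, 400])
for n, err in curve:
    print(n, round(err, 3))
print(smallest_adequate_size(curve, limit=0.10))
```

Requiring the error to stay within the limit for all larger sizes, rather than at a single size, guards against choosing a size where the curve dips below the limit by chance.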

Conclusion
This study used data from full-scale firebrand production experiments to train ML models that find and extract complex patterns from the data and use those patterns to predict the FAND and FAMD for structural and vegetative fuels at different wind speeds with an accuracy above 90%.
The research data show that the firebrand data do not have linear boundaries; thus, the non-linear non-parametric ML model, KNN, works best for prediction.
This research suggests a training data size of 250 per testing condition, generated from full-scale experiments, which keeps the accuracy from dropping below 90%. This reduction in sample size will save the time and resources required to generate data through experiments (Fig. 6).
This research is a small step towards the bigger goal of achieving a numerical firebrand generation simulator. The research shows that ML models can predict discrete FAND and FAMD from discrete independent variables with an accuracy higher than 90%. The proposed framework will help in the risk analysis of firebrand accumulation and ignition potential per fuel type. The current ML model can be used to predict possible spot ignition in a WUI fire for a given fuel type and distance. This can be used to predict locations with a high risk of spotting ignition and to help residents, firefighters, and other stakeholders take appropriate actions.
There is a need to determine the threshold values of FAND or FAMD causing spot ignition per recipient fuel at a specific distance and wind speed in a WUI fire.
The ML model can be further improved with the help of additional full-scale experiments. The accuracy of the prediction model can be increased by using continuous independent variables, such as wind speed and flying distance, instead of the current discrete ones. Continuous distance can be obtained by capturing camera images and analyzing them to measure the distance between the firebrands and the source. Continuous distance may also be obtained by using a plastic sheet for the firebrand collection area in addition to the computer vision aid. The experiments can be conducted at various wind speed levels (e.g., in increments of 1 m/s) to treat the wind speed as a pseudo-continuous independent variable. The firebrand images should be analyzed for dense spots (high FAND). The number of high-FAND centroids (unique firebrand positions), the distance of each high-FAND centroid from the origin, and the FAND at pseudo-continuous wind speeds (high-resolution wind speed bins) for different fuel type combinations will help make the prediction model more precise.