Application of the class-balancing strategies with bootstrapping for fitting logistic regression models for post-fire tree mortality in South Korea

We aimed to tackle a common problem in post-fire tree mortality where the number of trees that survived surpasses the number of dead trees. Here, we investigated the factors that affect Korean red pine (Pinus densiflora Siebold & Zucc.) tree mortality following fires and assessed the statistical effects of class-balancing methods when fitting logistic regression models for predicting tree mortality using empirical bootstrapping (B = 100,000). We found that Slope, Aspect, Height, and Crown Ratio potentially impacted tree mortality, whereas the bark scorch index (BSI) and diameter at breast height (DBH) significantly affected tree mortality when fitting a logistic regression with the original dataset. The same variables included in the fitted logistic regression model were observed using the class-balancing regimes. Unlike the imbalanced scenario, lower variabilities of the estimated parameters in the logistic models were found in balanced data. In addition, class-balancing scenarios increased the prediction capabilities, showing reduced root mean squared error (RMSE) and improved model accuracy. However, we observed various levels of effectiveness of the class-balancing scenarios on our post-fire tree mortality data. We still suggest a thorough investigation of the minority class, but class-balancing scenarios, especially oversampling strategies, are appropriate for developing parsimonious models to predict tree mortality following fires.


Introduction
Changes in global fire regimes (i.e., frequency, severity) have been detrimental, causing alterations in ecosystem services, such as carbon sequestration (Huang et al. 2009; Volkova et al. 2021), biodiversity (Calhoun et al. 2022), wildlife habitat (Van Lear and Harlow 2002), wood products (Rego et al. 2013), soil productivity (Sáenz de Miera et al. 2020; Hammond et al. 2021; Neary and Leonard 2021), and community protection (Alcasena et al. 2022). In South Korea, conifers are the dominant species, and the country experiences persistent seasonal and temporal droughts due to monsoons (Lee and Lee 2006), resulting in severe wildfires in the mountainous regions along the East Sea. Although most wildfires occur in the winter and spring seasons, when the climate is dry and cold with strong winds, fires now occur more frequently than in the past. Because of these uncharacteristic wildfire patterns, it is a challenge for fire suppression capacity to keep pace with the number of fires.
Depending on the characteristics of the fire, such as intensity, frequency, and duration, and on how it affects tree crown, stem, and root tissues, some trees are killed immediately, while others may perish over the next several years; trees may or may not recover even if bark char remains. The effects are classified into first- and second-order effects. First-order effects include the immediate heat transfer impacts of convection, radiation, and conduction (Michaletz and Johnson 2007; Bär et al. 2019). Second-order effects include physiological limitations related to carbon (C), water relations, and insect attacks following noncritical first-order effects (Michaletz and Johnson 2008). In addition, tree attributes, such as diameter at breast height (DBH), height, and crown ratio, and topographical characteristics (slope, elevation, and aspect) play an essential role in determining tree mortality following fires (Carmo et al. 2011; Watts et al. 2019; Hood 2020; Kwon et al. 2021). Quantifying fire damage is also crucial to determine post-fire tree mortality. Although there appear to be no solid criteria for fire severity, it is generally assessed based on the interruption of physiological processes by fire. Several indicators of fire severity have been introduced, including crown scorch and stem damage, such as bark char height, char depth, or direct sampling of cambium (Ganio and Progar 2017; Westlind and Kelsey 2019). Smith and Cluck (2011) documented which variables to use when marking trees that are likely to die following fires based on crown injury, cambium injury, and bark beetle activity for trees in California. Recently, Kwon et al. (2021) introduced a new metric for quantifying bark char in South Korea, the bark scorch index (BSI), which is based on measurements of bark scorched height (BSH) and bark scorched proportion (BSP). They approximated the damaged area and found significant differences in bark damage between live and dead trees. The BSI proved to be a ready-to-use index in the field, representing cambium status.
Environmental and Ecological Statistics (2023) 30:575-598
In the early post-fire restoration phase, it is necessary to quantify fire-killed trees for successful post-fire restoration because individuals respond differently to fire. Since tree mortality has a binary response (alive or dead), logistic regression is often used for modeling post-fire tree mortality. This approach is based on the probability associated with a set of variables and is determined by the logit function as follows:

$$\mathrm{Prob}(\text{Tree dead}) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_n x_n + \varepsilon)}},$$

where $x_1$ through $x_n$ are the explanatory variables and $\beta_0$ through $\beta_n$ are coefficients estimated from observed data using maximum likelihood (Shearman et al. 2019). Ryan and Reinhardt (1988) sought to predict the probability of tree mortality using logistic regression considering current fire behavior and effects in the U.S. This approach was later updated and is currently used for the First-Order Fire Effects Model (FOFEM), Fire and Fuels Extension to the Forest Vegetation Simulator (FFE-FVS), and BehavePlus, the most widely used post-fire tree mortality modeling tools in the U.S. (Keyser et al. 2017). Furthermore, Westlind and Kelsey (2019) used logistic regression to determine the variables and their combinations most highly associated with post-fire bark beetle attacks across all diameter classes for ponderosa pine in Oregon and Washington. This assessment is useful for post-fire land management, such as salvage logging, which facilitates the selection and harvesting of trees most likely to die quickly so that the damage to new tree seedlings and other regenerated vegetation is minimized (McIver and Starr 2000).
Most existing classifiers, including the logistic regression model, are based on the assumption that the classes in the training dataset are evenly balanced. However, in tree mortality prediction problems, class imbalance, in which the majority class outnumbers the minority class, is expected; i.e., more trees are alive than dead. Despite its rarity, the minority class may carry important information (Krawczyk 2016). Imbalanced binary classes are problematic for machine learning algorithms because the probability that each class is selected will also be biased, resulting in a pattern of low sensitivity (proportion of dead trees correctly predicted to die) and high specificity (proportion of live trees correctly predicted to live) or vice versa (Shearman et al. 2019).
Pre-processing of training datasets, rather than applying a learning algorithm directly, has been suggested as a way to reflect user preferences (Branco et al. 2016). Resampling strategies are an effective solution to overcome the imbalance problem (Estabrooks et al. 2004; Skryjomski and Krawczyk 2017). Shearman et al. (2019) applied logistic regression to examine tree mortality model performance by varying the proportion of the response classes in simulated samples and showed that increasing class imbalance had a crucial effect on model performance, with low sensitivity and high specificity for low mortality rates and vice versa. Salas-Eljatib et al. (2018) used stochastic simulation (Monte Carlo simulation) to draw a random sample of the survival status (alive or dead) with a specific proportion of the response classes and predicted tree mortality following fires. In the imbalanced data scenario, all parameters in the model were biased, and the bias increased as the proportion of either the alive or dead class deviated from the middle (50-50). Additionally, there are more common approaches for proper classification that manipulate the training data distribution by adding examples to the minority (oversampling) or removing examples from the majority (undersampling) until the desired class ratio is achieved (Seiffert et al. 2010; Blagus and Lusa 2015). The undersampling method is carried out by deleting majority-class examples from the training data, resulting in a loss of information but requiring less time, while no information is removed in oversampling. However, duplicating minority-class examples in the training set may lead to overfitting and require additional training time and effort (Chawla et al. 2003; Drummond and Holte 2003; Seiffert et al. 2010; Branco et al. 2016).
The Synthetic Minority Over-Sampling Technique (SMOTE), a combination of oversampling the minority while undersampling the majority, was suggested by Chawla et al. (2003) to overcome overfitting. Recently, Menardi and Torelli (2014) proposed a new method, Random Over Sampling Examples (ROSE), to alleviate the effects of the imbalanced setting using a smoothed bootstrap-based technique. The synthetic data are generated from a kernel density estimate with the same probability of selection between the majority and minority (Park and Jung 2019). Although these class-balancing approaches seem to be reasonable practices for logistic regression, few studies have addressed the importance of balanced data when fitting and validating post-fire tree mortality models. In the post-fire restoration process, post-fire regeneration is determined by many aspects (e.g., site condition, fire severity) through building consensus among stakeholder groups (Ryan and Hamin 2008; Ryu et al. 2017). In artificial regeneration after burning, it is important to identify damaged trees on sites so that they can be replaced with new trees, especially for red pine stands in South Korea. Trees with severe crown damage start to wither and die relatively rapidly, but this is not always the case for trees with bark char damage from surface fires; they may or may not die, depending on the characteristics of the individual trees. Therefore, it is important to develop new techniques to predict tree mortality quickly in situ for individual trees. Here, we built logistic regression models to determine the factors affecting post-fire tree mortality from surface fires. We then applied resampling strategies to draw balanced binary classes, used bootstrapping to explore how logistic regression models fitted under different oversampling strategies differ from those fitted to the original dataset, and report the statistical properties of the model for each oversampling technique.
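The resampling ideas described above can be illustrated with a minimal Python sketch. It is not the procedure used in this study (the analysis was run in R); the function names, the single synthetic feature, and the random seed are hypothetical, and only the class counts (271 live, 74 dead) are taken from the text. The `interpolate` helper shows only the interpolation step of a SMOTE-style method; a full SMOTE implementation would choose the second row among the nearest minority neighbours.

```python
import random

def random_oversample(rows, labels, minority=1, seed=0):
    """Duplicate randomly chosen minority rows until both classes are equal in size."""
    rng = random.Random(seed)
    minority_idx = [i for i, y in enumerate(labels) if y == minority]
    majority_idx = [i for i, y in enumerate(labels) if y != minority]
    n_extra = len(majority_idx) - len(minority_idx)
    extra = [rng.choice(minority_idx) for _ in range(n_extra)]
    keep = majority_idx + minority_idx + extra
    return [rows[i] for i in keep], [labels[i] for i in keep]

def random_undersample(rows, labels, minority=1, seed=0):
    """Drop randomly chosen majority rows until both classes are equal in size."""
    rng = random.Random(seed)
    minority_idx = [i for i, y in enumerate(labels) if y == minority]
    majority_idx = [i for i, y in enumerate(labels) if y != minority]
    kept_majority = rng.sample(majority_idx, len(minority_idx))
    keep = kept_majority + minority_idx
    return [rows[i] for i in keep], [labels[i] for i in keep]

def interpolate(a, b, gap):
    """SMOTE-style synthetic row on the segment between minority rows a and b."""
    return [ai + gap * (bi - ai) for ai, bi in zip(a, b)]

# Hypothetical one-feature rows; only the class counts come from the text.
rng = random.Random(1)
rows = [[rng.gauss(0.0, 1.0)] for _ in range(345)]
labels = [0] * 271 + [1] * 74  # 0 = live (majority), 1 = dead (minority)

over_rows, over_labels = random_oversample(rows, labels)
under_rows, under_labels = random_undersample(rows, labels)
synthetic = interpolate(rows[271], rows[272], gap=0.5)  # midpoint of two dead-tree rows
```

Oversampling keeps every original row and grows the dataset to 542 rows, whereas undersampling shrinks it to 148 rows, which is the information loss mentioned above.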

Site description and data collection
The dataset analyzed in this study was initially taken from Kwon et al. (2021), and we collected two additional years of data.
Data for this study were collected from the mountainous area around Samcheok, Korea (37° 15′ 53″ N, 129° 1′ 24″ E) (Fig. 1). Samcheok is a region dominated by Korean red pine stands with Quercus sp. as an understory layer. The stands were established in 1972, and wildfires destroyed 765 ha in Dogye on May 6, 2017. We distinguished the fire severities using satellite images and aerial photographs to find the regions appropriate for our study aims. Approximately 1.5 ha of the damaged area was selected as our study area. Most trees remained green because the area experienced a surface fire (low severity) with lower fire intensity; only 5 to 8% of the burned area was affected by the surface fire. The land in this region is composed of silt loam and silt clay loam soil with an average elevation of 700-800 m. The region has a temperate climate, but as the speed of the prevailing wind increases in the winter season, the air becomes drier and colder, with an atmospheric humidity level of 10%, amplifying the probability of fire. The wind direction during the fire was mainly southwesterly, with a mean speed of 14.6 m/s (43.6 m/s at maximum). The mean annual temperature and total precipitation are 12.6 °C and 1280 mm, respectively. The fires weakened and were extinguished at an atmospheric humidity level of 27%. We established three study sites to collect information on whether a tree was dead or alive. All Korean red pine trees that had intact crowns without any signs of heat damage, were taller than 8 m, and had a DBH larger than 15 cm were tagged, mapped, and measured, except for those standing outside the 10 m buffer zones surrounding each site. At the beginning of the study, we recorded both tree (DBH, height, first branch height, and crown height) and topographical (elevation, slope, and azimuth) characteristics. DBH was measured with calipers, and height-related measurements were taken with aluminum range poles. Elevation was measured with GPS (GPS Status & Toolbox, MobiWIA-EclipSim). Since azimuth is a continuous-circular variable that needs to be transformed, we divided it into two aspects: north and south. We also used aluminum range poles to measure BSH and BSP for each quadrant of the tree stems clockwise and then calculated the BSI as described by Kwon et al. (2021). Since trees were selected 2 months after the wildfires, any potential crown scorch was not assessed. In the following years, only the mortality status of each tree was recorded.
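To illustrate why a circular variable such as azimuth needs transforming before it enters a regression, the following Python sketch compares raw and circular differences and then dichotomizes azimuth into two aspects. The cut-offs used here (north for azimuths within 90° of due north, south otherwise) are an assumption made for illustration; the text does not state the boundaries that were actually applied, and the function names are hypothetical.

```python
def circular_difference(a_deg, b_deg):
    """Smallest angle in degrees between two compass azimuths."""
    d = abs(a_deg - b_deg) % 360.0
    return min(d, 360.0 - d)

def aspect_from_azimuth(azimuth_deg):
    """Return 'north' or 'south' for a compass azimuth in degrees.

    Assumed rule (not taken from the source): north if the azimuth lies
    in [270, 360) or [0, 90), south otherwise.
    """
    a = azimuth_deg % 360.0
    return "north" if (a < 90.0 or a >= 270.0) else "south"

# Azimuths of 359 and 1 degrees differ by 358 numerically but point almost the same way.
raw_gap = abs(359.0 - 1.0)
true_gap = circular_difference(359.0, 1.0)
aspects = [aspect_from_azimuth(a) for a in (10.0, 180.0, 350.0)]
```

A two-level factor avoids the numerical discontinuity at 0°/360°, at the cost of discarding the finer directional information.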

Original and simulated data analysis
Data were analyzed using R (R Core Team 2020). Unlike in the study by Kwon et al. (2021), unexpected salvage logging and anomalous events, such as windthrows and breakdowns, resulted in missing tagged trees in 2020. Only 345 individual trees were selected, including 74 dead trees and 271 live trees. Because of the limited sample size, we chose resampling strategies based on oversampling: random oversampling (OS), ROSE, and SMOTE. We used the installed packages called ROSE (for OS and ROSE) and SMOTE for each sampling scheme. We applied a 60:40 split between the training and test sets, and empirical (nonparametric) bootstrapping, a tool frequently used to estimate measures of uncertainty in parameters associated with a given statistical method, was applied to both the original (class-imbalanced) and class-balanced datasets to examine the effect of class imbalance. For each bootstrap sample, a multiple logistic regression was fitted by maximum likelihood estimation (MLE) using the glm function in R to predict the probability of post-fire tree mortality. To keep the sampling error small, B = 100,000 repetitions were conducted, in agreement with the number of simulations used by Salas-Eljatib et al. (2018). Six predictors (Slope, DBH, Height, BSI, Aspect, and CrownRatio) were determined to have significant effects on tree mortality based on univariate models constructed for each variable in a preliminary study and were used to build the models. Two models were developed: (1) a full model that included all variables, and (2) a nested model selected based on Akaike's information criterion (AIC). To determine the significance of the bootstrap coefficients, we calculated bootstrap confidence intervals (CIs).
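The bootstrap procedure described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the analysis code of this study: the data are synthetic, the model has a single hypothetical predictor instead of six, the Newton-Raphson fitter stands in for R's glm, B is set to 200 rather than 100,000 to keep the run short, and a percentile interval is assumed for the bootstrap CI.

```python
import math
import random

def sigmoid(z):
    """Logistic function, with the linear predictor clipped to avoid overflow."""
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(x, y, iterations=25):
    """Maximum-likelihood fit of P(y = 1) = sigmoid(b0 + b1 * x) by Newton-Raphson."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = sigmoid(b0 + b1 * xi)
            w = p * (1.0 - p)
            g0 += yi - p              # score for the intercept
            g1 += (yi - p) * xi       # score for the slope
            h00 += w                  # entries of the information matrix X'WX
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        if det < 1e-12:
            break
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

def percentile_interval(values, alpha=0.05):
    """Percentile bootstrap interval (assumed CI type for this sketch)."""
    s = sorted(values)
    lower = s[int((alpha / 2.0) * (len(s) - 1))]
    upper = s[int((1.0 - alpha / 2.0) * (len(s) - 1))]
    return lower, upper

# Synthetic stand-in data: 345 trees, one standardized predictor, roughly 20% mortality.
rng = random.Random(42)
n = 345
x = [rng.gauss(0.0, 1.0) for _ in range(n)]
y = [1 if rng.random() < sigmoid(-1.6 + 1.2 * xi) else 0 for xi in x]

# 60:40 split into training and test indices.
idx = list(range(n))
rng.shuffle(idx)
train = idx[: (n * 6) // 10]

# Empirical bootstrap: resample training rows with replacement and refit each time.
B = 200
slopes = []
for _ in range(B):
    sample = [rng.choice(train) for _ in train]
    _, b1 = fit_logistic([x[i] for i in sample], [y[i] for i in sample])
    slopes.append(b1)

low, high = percentile_interval(slopes)
```

The same loop could be run on a class-balanced training set to compare the spread of the bootstrap coefficients between scenarios.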
Statistical properties for each parameter in the models were assessed using bootstrap bias, variance, and root mean square error (RMSE), where $\mathrm{Var}_B(\hat{\beta})$ is the sample variance of $\hat{\beta}^{*(1)}, \hat{\beta}^{*(2)}, \dots, \hat{\beta}^{*(B)}$, used as an estimate of the variance of the sample parameter $\hat{\beta}$, and $\bar{\beta}^{*}_{B}$ is the mean of the bootstrap samples. In addition, we evaluated model performance using five assessment metrics [i.e., sensitivity, specificity, B-accuracy, geometric mean (GM), and Youden's index (YI)] for the classification coupled with class-balanced datasets. Given the 100,000 sets of coefficients derived from the iterations, we calculated each index and averaged them. We set a cutoff value of 0.50 to diagnose tree mortality for the models, and all hypotheses were examined at an α level of 0.05.
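As a hedged illustration of these statistical properties, the sketch below computes bootstrap bias, variance, and RMSE for one coefficient. It assumes the conventional definitions (bias as the bootstrap mean minus the original estimate, variance with divisor B - 1, and RMSE as the square root of variance plus squared bias); the exact formulas used in this study are not reproduced in the text, and the numbers are made up.

```python
import math
from statistics import mean, variance

def bootstrap_properties(estimate, replicates):
    """Return (bias, variance, rmse) of an estimate from its bootstrap replicates.

    Assumed definitions: bias = mean(replicates) - estimate,
    variance = sample variance with divisor B - 1,
    rmse = sqrt(variance + bias ** 2).
    """
    boot_mean = mean(replicates)
    bias = boot_mean - estimate
    var = variance(replicates)
    rmse = math.sqrt(var + bias ** 2)
    return bias, var, rmse

# Hypothetical coefficient estimate and five bootstrap replicates.
bias, var, rmse = bootstrap_properties(2.0, [1.8, 2.1, 2.4, 2.2, 2.0])
# bias is about 0.1, var about 0.05, rmse about 0.245
```

Under these definitions a coefficient can have a larger bias and still a smaller RMSE if its variance falls enough, which is why bias and variance are reported separately.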

Original data: factors influencing the probability of post-fire tree mortality
We examined the parameters of the logistic regression model fitted to the original dataset (Table 1). While Slope, Aspect, Height, and CrownRatio could impact tree mortality, we found that BSI and DBH were significant in both the full and nested models. Even though Slope showed no significant impact on tree mortality in the nested model, the model that included it was the best in terms of AIC, resulting in four parameters in total. We found that trees with higher BSI or smaller DBH, while holding other variables constant, were more likely to die. According to the standardized coefficients, BSI had the most substantial effect on post-fire tree mortality in the full model, followed by DBH and Slope. In contrast, the impact of DBH was the strongest in the nested model. In addition, there seemed to be no obvious difference in AIC between the full and nested models, so we compared the two models statistically to see whether the nested model would exhibit better performance. We found no significant difference between the two models in their performance in estimating tree mortality, so we favor the simpler model because it can be interpreted more easily with similar power.

Simulated data: effects of class-balancing
Based on the bootstrap confidence intervals for both models, the significance of the variables under the class-balancing scenarios was similar to that in the model fitted to the original dataset (Table 2). In addition, we found that Aspect in OS, ROSE, and SMOTE was significant in the full model but insignificant in the original dataset, and Slope in OS was significant in both models. Table 4 shows the standardized parameter estimates of the logistic regression model, which identified BSI as the main factor discerning fire-induced tree mortality regardless of class-balancing scenario, followed by DBH, except in the ROSE scenario, where CrownRatio was the second most important factor. Figures 2 and 3 show the empirical distributions of the estimated parameters of the post-fire Korean red pine mortality model developed from the bootstrap samples for the full and nested models. We observed that imbalanced binary classes affected the distribution of bootstrap regression coefficients, and even within the same variable, different distributions were observed depending on the class-balancing regime. Higher variances of the estimated parameters were found in the original bootstrapping dataset than in the class-balanced data in both models. One interesting observation was that Aspect in the full model showed a bimodal distribution from bootstrapping with imbalanced data. Although class-balancing approaches improved this problem, a small peak in SMOTE was still observed. We found that lower variability and bias were observed in class-balancing scenarios in both models; however, the mean of Aspect in the full model and that of BSI in both models still deviated from the actual parameter values in each scenario. With the assistance of Tables 6 and 7, we found that the Intercept in ROSE and the Slope in ROSE and SMOTE were more biased than those fitted by the original model. Additionally, we found that the variables included in the nested model showed lower variability and bias than those in the full model. Figure 4 shows how much the accuracy of the estimated parameters in each model was affected by data-balancing methods. For the full model, we found a noticeable improvement when using balanced data relative to the imbalanced data despite the different degrees of effectiveness of the balancing methods for each coefficient. Additionally, regardless of the data-balancing method, Slope, DBH, Height, and BSI in the full model had small variance and bias, while Intercept, Aspect, and CrownRatio had high variance and bias, supporting the interpretation of Figs. 2 and 3. In particular, Aspect was sensitive to bootstrapping, showing high variance and bias and resulting in the highest RMSE; however, the data-balancing process effectively reduced variance and bias compared to original bootstrapping. Similarly, all variables in the nested model, except for Intercept, had relatively lower variance and bias than those in the full model. Nevertheless, in some cases, the coefficients developed from the class-balancing process were more biased than those from the imbalanced dataset. However, their variances were  6).
In addition, we evaluated model performance with different data-balancing strategies for the full and nested models (Fig. 5, Table 7). The bootstrap dataset with imbalanced classes had a low sensitivity (0.3571) and high specificity (0.9636) when using the full model. The class-balancing process drastically increased sensitivity (up to 0.9286) but reduced specificity (0.7697 on average); SMOTE showed the highest specificity among the class-balanced schemes, followed by OS and ROSE. We found that the B-accuracy, GM, and YI from the imbalanced dataset were the smallest among the sampling scenarios, and those in SMOTE were the highest among the class-balancing scenarios, since those metrics are functions of sensitivity and specificity. That is, better classification results will be obtained as the discrepancy between the classes in the training set decreases. The same trend appeared in the nested model, but higher specificity, B-accuracy, GM, and YI were obtained relative to the full model.
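As a worked illustration of how the three summary indices follow from sensitivity and specificity, the Python sketch below applies the commonly used definitions (assumed here, since the formulas are not reproduced in the text): B-accuracy as the mean of sensitivity and specificity, GM as the square root of their product, and YI as their sum minus one. The first call uses the values reported above for the imbalanced full model. The second pairs the maximum reported sensitivity with the average reported specificity, which is an illustrative pairing only and does not correspond to any single scenario in Table 7.

```python
import math

def summary_indices(sensitivity, specificity):
    """Balanced accuracy, geometric mean, and Youden's index (assumed standard forms)."""
    return {
        "balanced_accuracy": (sensitivity + specificity) / 2.0,
        "geometric_mean": math.sqrt(sensitivity * specificity),
        "youden_index": sensitivity + specificity - 1.0,
    }

# Reported values for the imbalanced full model.
imbalanced = summary_indices(0.3571, 0.9636)
# balanced_accuracy is about 0.660, geometric_mean about 0.587, youden_index about 0.321

# Illustrative pairing only: maximum sensitivity with average specificity.
balanced_like = summary_indices(0.9286, 0.7697)
```

Because all three indices reward sensitivity and specificity jointly, a large gain in sensitivity outweighs a moderate loss in specificity.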

Investigation of factors affecting tree mortality
It is essential to develop a more systematic approach for discerning tree mortality to support rapid, plan-based restoration (thinning and regeneration), thereby reducing the time and cost of thinning and planting. We found that BSI and DBH significantly affected post-fire tree mortality for Korean red pine trees. DBH is closely related to bark thickness, and thin bark provides little insulation, resulting in deformation of the xylem and phloem (Watts et al. 2019; Hood 2020). Lee and An (2009) suggested that the high mortality of Korean red pines after wildfires is caused by their thin bark. In our study, as DBH increased, the probability of fire-induced tree mortality decreased due to increased tree resistance to fire. We observed that dead trees with DBH larger than 35 cm accounted for approximately 2% of the total trees in our study sites. Although many species have thick bark protecting the living tissues of the cambium and phloem from fire heat, Hood et al. (2010) did not find a significant effect of DBH on tree mortality in conifers in California, suggesting that tree size may not be a suitable proxy for tree resistance to fire for all trees. Hood and Lutes (2017) examined post-fire tree mortality among conifers in the western U.S. using FOFEM, and the probability of mortality increased as DBH increased for white fir (Abies concolor). This suggests that DBH does not represent bark properties, such as fissures, moisture content, and density (Dickinson and Johnson 2001), or reductions in tree vigor due to aging and beetle attack (Hood 2010). Additionally, Bova and Dickinson (2005) showed a stronger association of integrated heat flux with bark thickness than with DBH in Acer rubrum and Quercus prinus forests. DBH could be used to represent resistance to fire, but we recommend investigating the relationship between bark thickness and fire-induced tree mortality.
Additionally, quantifying fire damage is crucial for determining post-fire tree mortality. Our study sites experienced only surface fires, so we assumed that there was no crown damage, only bark injuries. We successfully evaluated fire damage on the bole using BSI, which was revealed as the most influential attribute in both models (Table 4). However, BSI did not fully explain fire severity, exhibiting high variability in the original bootstrap distribution for both models (Figs. 2, 3). Ryan (1982) suggested that since fire duration and intensity vary around a bole with different shapes, BSI may represent superficial damage by surface fires. Ryan (1982) also suggested stratifying the char depth on the bole near the base to classify the internal damage with the following classifications: none, superficial, moderate, and deep. Damage to cambium tissue accounts for physical and physiological changes, such as color, smell, density, and plasticity (Gao and Cha 2009). Hood and Bentz (2007) used the cambium-kill rating (CKR), a sum of dead cambium samples per tree (0-4), as a fire-injury variable, and a greater CKR was detected in dead trees than in live trees. Therefore, diagnosis combining visual (outer) and elaborate (inner) inspections yields noteworthy results, explaining fire intensity and persistence with low variability.
We noticed that trees on steeper slopes and with northern aspects were potentially more likely to die. At the beginning of the study, we tested the effectiveness of azimuth for predicting tree mortality. However, it had little interpretive power beyond factorizing the directions. Therefore, we divided azimuth into two directions: north and south. We found that fire-induced tree mortality was more likely to occur in the northern aspect than in the southern aspect, even though the difference was nonsignificant. Kwon et al. (2021) suggested that since strong winds blew from the southwest, more trees were prone to die in the northern aspects, and DBH was significantly larger in the southern site than in the other two sites. This pattern may be caused by the fact that the fire was less intense at the site with a southern aspect, leading to the small number of injured trees. In addition, Aspect is a stand-level variable rather than a tree-level variable, so it might be confounded. Thus, more replications are needed to investigate the predictive power of Aspect. Given more data on multiple fires, Aspect might be useful for describing post-fire tree mortality in relation to wind properties.
In our post-fire tree mortality study, we examined individual dead tree characteristics, yet some limitations exist. As mentioned, large trees with low BSIs had a lower probability of dying from fires. We hypothesized that the wider the DBH was, the greater the contact area affected by fires, resulting in a higher BSI. Likewise, we found a relationship between tree mortality and Aspect; however, smaller trees tended to grow in the southern aspects of our study site. These confounding variables may obscure the interpretation. We tested whether any interactions existed between the variables above, but no significant interaction was found. This outcome may be due to the number of replications, so more replications are recommended to increase the power of our study. In addition, we noted that some trees were harvested in 2020, leading to a reduction in sample size compared to the study by Kwon et al. (2021). However, we confirmed that harvesting had no significant effect on stand attributes or on the impact of each predictor on post-fire tree mortality reported in the previous study. Thus, although the intermediate salvage logging may be a potential confound through its effect on sample size, we found no other significant effect of harvesting.

Assessment of the fitted model coupled with class-balancing strategies
As expected, the significance of the variables in the bootstrap regression on class-balanced data mostly coincided with that in the model fitted to the original dataset (Table 2). Regardless of which balancing approach we chose, we found that all the properties in the nested model outperformed those in the full model, but we focused on the full model to better explain all the variables for which we collected data. We found that the prediction capabilities of our models were improved when using class-balanced data, showing reduced error rates (Fig. 4). We highlighted the bimodal distributions of Aspect in the full model, which did not converge even with B = 100,000 iterations (Fig. 2). Among the balancing methods, this was most notable in SMOTE, but it was relatively trivial compared to the results for the imbalanced dataset. We used contingency tables to assess the changes in discrepancies between mortality and Aspect according to the class-balancing methods. In our study, the majority of trees facing south were located at site 3, and only one south-facing tree was dead in the original training dataset, which may have resulted in divergence of the bootstrapping results. On the other hand, the balanced datasets included more dead trees in the southern aspect, followed by OS, ROSE, and SMOTE (Fig. 2, column 6). This also clarifies why the significance of Aspect was reliable in the balanced scenarios, showing proper bootstrap distributions in the full model. King and Zeng (2001) suggested that adding cases (occurrences, coded as one) to the data would decrease the variance in the MLE parameters since the model obtained from the datasets was fitted by maximum likelihood. That is, the estimated parameters are the most likely to reflect the data (Schabenberger and Pierce 2001). In addition, Salas-Eljatib et al. (2018) illustrated the effect of class imbalance using empirical bootstrap regression by varying the proportion of non-occurrences (coded as zero) and revealed the predictive capabilities of each coefficient in the fitted model. They noted that the variance in the MLE parameters decreased as the class discrepancy decreased, resulting in the lowest RMSE. However, we found some properties in the specific class-balancing methods that underperformed relative to BSI in the nested model; higher RMSE values of BSI and CrownRatio were observed in SMOTE in the full model than from the original bootstrapping process, leading to higher bias and variance. The higher variance and bias in the SMOTE approach might have been caused by how SMOTE generates the data. Unlike other resampling methods, it generates new artificial samples by combining existing ones rather than duplicating existing information. Here, we analyzed the dataset across the sites to amplify the sample size without considering random effects. The selected samples used for the SMOTE algorithm might provide unique information, causing noise while bootstrapping.
Due to the large number of live trees in the sites, the model fitted with an imbalanced training set seriously underestimated the number of dead trees, exhibiting low sensitivity and high specificity, and tending to predict dead trees to be alive (type II error). Zeros were heavily weighted in the imbalanced dataset, so high TN and low TP were expected in the confusion matrix, while the balanced datasets increased the number of ones. As a result, the model had increased sensitivity and marginally lower specificity, resulting in decent B-accuracy, GM, and YI. Shearman et al. (2019) found that as the difference between the numbers of live and dead trees increased, the gap between sensitivity and specificity increased. This suggests that balancing data is appropriate for predicting post-fire tree mortality. However, in binary classification problems, it should be acknowledged that some metrics calculated from the elements of a confusion matrix change as data distributions change. Tharwat (2021) revealed that accuracy, F-measure, and precision, common metrics for evaluating classification performance, were more sensitive to imbalanced data. Therefore, it is necessary to choose metrics that are insensitive to the data distribution. We found an increase in FP and a decrease in FN when we applied class-balancing methods, resulting in reduced precision in class-balancing schemes due to the increased number of FPs even though a reasonable F-measure was achieved (not included).
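The sensitivity of confusion-matrix metrics to the class distribution can be illustrated with a small Python sketch. The counts below are hypothetical and are not the confusion matrices of this study; they describe an invented test set of 28 dead and 110 live trees, chosen only so that the resulting rates are of a similar order to those reported above.

```python
def confusion_metrics(tp, fn, tn, fp):
    """Standard metrics from confusion-matrix counts (dead = positive class)."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "f_measure": 2 * tp / (2 * tp + fp + fn),
    }

# Hypothetical counts for a model trained on imbalanced data: few dead trees detected.
imbalanced = confusion_metrics(tp=10, fn=18, tn=106, fp=4)
# sensitivity ~0.357, specificity ~0.964, precision ~0.714, accuracy ~0.841, f_measure ~0.476

# Hypothetical counts for a model trained on balanced data: more FP, fewer FN.
balanced = confusion_metrics(tp=26, fn=2, tn=85, fp=25)
# sensitivity ~0.929, specificity ~0.773, precision ~0.510, accuracy ~0.804, f_measure ~0.658
```

In this invented example, overall accuracy barely distinguishes the two models and even favors the one that misses most dead trees, while precision falls and the F-measure rises as false positives replace false negatives.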

Investigation of levels of effectiveness of class-balancing regimes
We assessed the effectiveness of class-balancing strategies when predicting fire-induced tree mortality through bootstrapping compared to using original data. To compare the effectiveness of the class-balancing scenarios, we evaluated the statistical properties and model performance among the class-balancing methods. We found that all statistical properties of MLE parameters in the fitted model were affected by balancing approaches. In the full model, all variables except the intercept and height showed the highest RMSE in SMOTE and the lowest in OS. Although the statistical properties of each parameter developed by SMOTE seem inadequate, the model with SMOTE showed the best model performance (Fig. 5, Table 7). The performance of OS was not as good as that of SMOTE, but still fair. These trends in the nested model were inconsistent with those in the full model. The instability of statistical properties and model performance among the class-balanced data may derive from the minority class distribution in the original data. In imbalanced data, minority samples either form homogeneous regions composed solely of one class (safe) or do not (unsafe). Unsafe samples are subdivided into borderline, outlier, and rare types, which are more likely to be misclassified (Napierala and Stefanowski 2016; Skryjomski and Krawczyk 2017). We checked the minority class in our training set using multidimensional scaling (MDS) to visualize projections into newly created dimensions. We found that most minority samples were in the borderline region in the original dataset. For the OS and ROSE datasets, we found more scattered patterns in the minority samples, but those in ROSE formed more homogeneous groups than those in OS, although the difference was small (Fig. 6). In the SMOTE dataset, we discerned a different minority pattern: approximately 60% of the minority class formed homogeneous regions (safe), and the rest was composed of borderline samples and outliers.
We observed a somewhat consistent pattern corresponding to our model performance result. However, Liu (2022) noted that SMOTE does not consider any nearest minority samples, resulting in overlapping between classes. Thus, it eventually includes noisy minority samples in the area belonging to the majority class. Han et al. (2005) proposed an advanced SMOTE, namely Borderline-SMOTE, in which only the minority samples bordering the majority class (borderline) in the training set are oversampled. They compared conventional SMOTE and OS with the newly introduced method and found that the new method more easily classified observations on the borderline as the minority class. Skryjomski and Krawczyk (2017) hypothesized that SMOTE performance would exhibit the following trend: the more balanced the original dataset, the better the performance. However, they found that class ratio alone cannot always explain learning performance. Rather, they proposed assessing the local characteristics of minority classes when dealing with imbalanced data. Krawczyk (2016) also suggested distinguishing between minority and noisy samples when conducting imbalanced regression, emphasizing a deeper investigation of what makes samples noisy and which minority samples contain valuable information.
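One way to assess the local characteristics of minority samples numerically is to count how many of each sample's nearest neighbours share its class. The sketch below is not the MDS-based inspection used in this study. It assumes the neighbourhood thresholds commonly associated with the scheme of Napierala and Stefanowski (2016) for k = 5 (four or five same-class neighbours: safe; two or three: borderline; one: rare; none: outlier), and the function name and toy coordinates are hypothetical.

```python
import numpy as np

def minority_types(X, y, minority_label=1, k=5):
    """Label each minority sample by the number of same-class samples
    among its k nearest neighbours (Euclidean distance)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)           # exclude the sample itself
    types = []
    for i in np.flatnonzero(y == minority_label):
        nearest = np.argsort(dist[i])[:k]
        same = int(np.sum(y[nearest] == minority_label))
        if same >= 4:
            types.append("safe")
        elif same >= 2:
            types.append("borderline")
        elif same == 1:
            types.append("rare")
        else:
            types.append("outlier")
    return types

# Toy data: a compact minority cluster, a compact majority cluster,
# and one minority point sitting inside the majority cluster.
minority_cluster = [(0, 0), (0.1, 0), (0, 0.1), (0.1, 0.1), (0.2, 0), (0, 0.2)]
majority_cluster = [(10, 10), (10.1, 10), (10, 10.1), (10.1, 10.1), (10.2, 10), (10, 10.2)]
X = np.array(minority_cluster + majority_cluster + [(10.05, 10.05)])
y = np.array([1] * 6 + [0] * 6 + [1])
types = minority_types(X, y)
```

Here the six clustered minority points are labelled safe and the isolated one is labelled an outlier. Comparing the share of each type before and after resampling gives a numerical counterpart to a visual inspection.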
Class-balancing is a process that converts a dataset with class imbalance into one with similar or equal class proportions. It has been well justified in many fields, but few studies have explained the importance of class-balancing techniques relevant to tree mortality after fires (Salas-Eljatib et al. 2018; Shearman et al. 2019). Furthermore, few studies have investigated class-balancing scenarios in ecology. We used not only OS but also two advanced oversampling methods based on specific algorithms, ROSE and SMOTE, which are known as strategies that overcome the overfitting common in OS. The results showed that each oversampling method significantly impacted the statistical properties of each coefficient included in both models. Another class-rebalancing technique, undersampling, is a prevalent method in which the number of samples in the majority class is reduced to the number in the minority class (More 2016). However, because the majority class in the training data is reduced, critical information may be lost, and this becomes more problematic as the discrepancy between the two classes increases. We also tested undersampling in our study (results not included) but observed erratic error terms (RMSE, bias, and variance). Even though the imbalance ratio of our training data sampled from the original dataset was approximately 4:1, meaning the imbalance was not as severe as in other class-imbalance studies, the sampled minority class was too small (fewer than 50). After undersampling, we had at most 100 samples, which may result in high variability. Given a sample size large enough for undersampling, confirming the reliability of the class-balancing strategies would be possible.
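The effect of the two basic strategies on training-set size can be sketched as follows. The example assumes a binary response with the minority class coded as one and uses hypothetical counts (160 live, 40 dead) that mimic the approximate 4:1 ratio described above; the function name `rebalance_indices` is an assumption made for illustration.

```python
import numpy as np

def rebalance_indices(y, strategy, rng):
    """Return row indices of a class-balanced training set.
    Assumes the minority class is coded 1 and the majority class 0."""
    y = np.asarray(y)
    idx_min = np.flatnonzero(y == 1)
    idx_maj = np.flatnonzero(y == 0)
    if strategy == "under":
        # Discard majority rows until both classes have the minority count.
        kept = rng.choice(idx_maj, size=len(idx_min), replace=False)
        return np.concatenate([kept, idx_min])
    if strategy == "over":
        # Duplicate minority rows until both classes have the majority count.
        extra = rng.choice(idx_min, size=len(idx_maj) - len(idx_min), replace=True)
        return np.concatenate([idx_maj, idx_min, extra])
    raise ValueError("strategy must be 'under' or 'over'")

rng = np.random.default_rng(0)
y = np.array([0] * 160 + [1] * 40)       # hypothetical 4:1 training set
under = rebalance_indices(y, "under", rng)
over = rebalance_indices(y, "over", rng)
print(len(under), len(over))             # 80 rows versus 320 rows
```

Undersampling leaves only 80 observations for model fitting in this example, so each bootstrap replicate rests on very little data, which is consistent with the unstable error terms noted above.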
In this study, we considered modeling strategies for imbalanced domains, focusing on data pre-processing rather than the algorithm level, and demonstrated the effectiveness of using class-balanced data when fitting the logistic regression model using bootstrapping. Since we had no information about the population distribution of tree mortality, empirical (nonparametric) bootstrapping was used. Here, we did not attempt to find the best data-resampling approach, since the natural characteristics of the datasets also affect the effectiveness of learning algorithms with different resampling techniques (Chakravarthy et al. 2019). Rather, we used the bootstrap technique to capture the accuracy of estimators (θ) in the fitted model corresponding to the different class-balancing schemes. In addition to bootstrapping, many studies have revealed the effect of rebalancing classes with several other machine learning (ML) algorithms. Shearman et al. (2019) addressed the class-imbalance problem for a post-fire tree mortality dataset in northern Florida, USA, by comparing the effects of logistic regression and random forest (RF) under different proportions of class imbalance. They emphasized the importance of class-balanced data when using RF models because these models are also based on a bootstrap sample of the training data. That is, establishing an equal probability of drawing a bootstrap sample is a priority before using machine learning techniques (Chen et al. 2005). They demonstrated the limitation of using a logistic regression model alone with the class-imbalanced dataset and highlighted the effectiveness of the RF model. Demir and Şahin (2022) examined the effectiveness of oversampling (OS, ROSE, and SMOTE) in conjunction with several learning algorithms, such as Naïve Bayes, RF, and support vector machine (SVM).
They found specific combinations of sampling strategies and learning algorithms that produced synergistic effects, among which SVM with SMOTE stood out, showing the highest accuracy. In our study, the model performance indices in SMOTE for both models, including sensitivity, specificity, accuracy, and B-accuracy, seemed reasonable, even though the RMSE values of some bootstrap coefficients in the full model were higher in SMOTE. Chakravarthy et al. (2019) provided detailed insight into ROSE and SMOTE sampling applied to different domains with several learning algorithms and reported a similar result: the resampling techniques responded differently to the classifiers and even to the data types. They suggested that more work should be done to develop an original composite measure for evaluating the resampling methods.
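The bootstrap assessment of estimator accuracy discussed above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the analysis pipeline of the study: the data are simulated, the logistic model has a single hypothetical predictor, B is set to 200 rather than 100,000 to keep the run short, and bias and RMSE are measured against the estimate from the original sample.

```python
import numpy as np

def fit_logistic(X, y, n_iter=25):
    """Maximum-likelihood logistic regression via Newton-Raphson."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        w = p * (1.0 - p)
        gradient = X.T @ (y - p)
        hessian = X.T @ (X * w[:, None])
        step = np.linalg.solve(hessian, gradient)
        beta = beta + step
        if np.max(np.abs(step)) < 1e-8:
            break
    return beta

# Simulated data (assumption): intercept plus one standardized predictor.
rng = np.random.default_rng(0)
n = 400
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
prob = 1.0 / (1.0 + np.exp(-(X @ np.array([-1.5, 1.0]))))
y = (rng.random(n) < prob).astype(float)

theta_hat = fit_logistic(X, y)           # estimate from the original sample

B = 200                                  # number of bootstrap replicates
boot = np.empty((B, X.shape[1]))
for b in range(B):
    idx = rng.integers(0, n, size=n)     # resample rows with replacement
    boot[b] = fit_logistic(X[idx], y[idx])

bias = boot.mean(axis=0) - theta_hat
variance = boot.var(axis=0)
rmse = np.sqrt(np.mean((boot - theta_hat) ** 2, axis=0))
print("bias", bias, "variance", variance, "RMSE", rmse)
```

With these definitions, the squared RMSE equals the squared bias plus the variance for each coefficient, so a higher RMSE under a given balancing scheme can be traced to either component. To compare schemes, the resampling step would be applied to a balanced version of the training data instead of the original one.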

Conclusions
After surface fires, there are often more live trees than dead trees, yet the dead trees often represent the more interesting class despite their rarity. We applied class-balancing regimes when fitting the logistic regression models and examined their effectiveness in terms of statistical properties and model performance. We found that trees with higher BSI or smaller DBH have a greater likelihood of mortality, and the significance of the variables depended on the sampling strategies. Irrespective of the balancing method, the most critical determinant was BSI, followed by DBH. Furthermore, the class-balancing methods improved both the statistical properties of the bootstrapped parameters and model performance, balancing sensitivity and specificity. However, the effectiveness of the data-rebalancing schemes differed because each uses a distinct algorithm to balance the classes from an original sample with high variability. We still suggest a deeper assessment of whether a specific minority class contains valuable information or is noisy, yet our results indicate that oversampling can be an appropriate alternative for addressing the class-imbalance problem in tree mortality prediction following fires.