Identifying anthropogenic and natural causes of wildfires by maximum entropy method-based ignition susceptibility distribution models

Turkey has a high potential for wildfires along its Mediterranean coast because of its dense forest cover and mild climate. An average of 250 wildfires occur every year, destroying more than 10,000 hectares, due to natural and human-related causes. The study area is sensitive to fires caused by lightning, stubble burning, discarded cigarette butts, electric arcing from power lines, deliberate fire setting, and traffic accidents. However, 52% of causes could not be identified because intense wildfires occurred at the same time and equipment and personnel were insufficient. Since wildfires destroy forest cover, ecosystems, biodiversity, and habitats, they should be spatially evaluated by separating them according to their causes, considering the environmental, climatic, topographic and forest structure variables that trigger wildfires. In this study, wildfires caused by lightning, the burning of agricultural stubble, discarded cigarette butts and power lines were investigated in the provinces of Aydın, Muğla and Antalya, where 22% of Turkey’s wildfires occurred. The MaxEnt method was used to determine the spatial distribution of wildfires and to identify risk zones for each cause. Wildfire locations were treated as a species distribution and the probability of their occurrence was estimated. Additionally, since the causes of many wildfires are unknown, determining the causes is important for fire prediction and prevention. The highest wildfire occurrence risks were 9.7% for stubble burning, 30.2% for lightning, 4.5% for power lines and 16.9% for discarded cigarette butts. In total, the causes of 1,266 of the 1,714 wildfires of unknown origin were identified by analysis of the cause-based risk zones, and the risk zones were then updated by including the cause-assigned wildfire locations for verification. As a result, the Area Under the ROC Curve (AUC) values of the susceptibility maps increased.


Introduction
Forests are the main source of the biodiversity and habitat richness of a region, closely related to species distribution and ecosystem services (Demeke and Afework 2014;Suryabhagavan et al. 2016). Biodiversity and habitat are threatened by wildfires and these result in environmental transformation, land use change, deforestation, and climate change (FAO 2007). Declining precipitation, increasing seasonal temperatures and the effects of climate change make wildfires a recurrent problem world-wide and the risk of wildfires increases yearly. The causes of wildfires may be divided into natural and man-made sources. Although wildfires based on natural factors such as lightning, extreme temperatures, and spontaneous igniting of fuels cannot be predicted, human causes such as discarded cigarette butts, electric arcs on power lines, campfires, deliberate sabotage and stubble burning may be anticipated. The negative, irreversible effects of wildfires require identification of their spatial relationships for the generation of wildfire risk maps and prevention procedures (Tian et al. 2013), and this has become critical in the last decade (Miller and Ager 2013).
With their ability to combine quantitative and qualitative factors (Pomerol and Romero 2000), multi-criteria decision analysis (MCDA) methods have been integrated into wildfire susceptibility mapping. Among the available methods, the machine learning-based maximum entropy (MaxEnt) and random forest algorithms are non-parametric methods widely used to estimate the spatial distribution of wildfires from spatial layers and presence locations (Hastie et al. 2009). The MaxEnt model is often referred to as a nonlinear regression model; in wildfire occurrence mapping, historical wildfires are treated as a species distribution to reveal the probability of occurrence outside historical wildfire locations (Parisien and Moritz 2009; Renard et al. 2012; Bar-Massada et al. 2012, 2013; De Angelis et al. 2015; Vacchiano et al. 2018; Yago et al. 2019; Bekar et al. 2020; Banerjee 2021; Tariq et al. 2022). Random forest is also a non-parametric model; it uses an ensemble of decision trees to predict the probability of wildfire occurrence (Archibald et al. 2009; Aldersley et al. 2011; Olivera et al. 2012; Arpaci et al. 2014; Rahmati et al. 2016).
Recent studies on wildfire susceptibility mapping use many criteria covering environmental, social, topographic, climatic, and meteorological parameters. Although studies have focused on specific relationships between topography and forests, climate and forests, and artificial features (buildings, roads) and forests, most have examined relationships between variables. Among the variables used in wildfire risk mapping, elevation, slope, aspect, vegetation type, temperature and distance to roads are the most frequent (Table 1). Variables such as tree cover density, distance to rivers and water sources, probability of soil wetness, and solar radiation are not widely used. Considering only a limited number of factors leads to superficial risk maps with any method used in wildfire susceptibility mapping, as wildfire probabilities require a comprehensive assessment of the variables and their interactions with each other. Therefore, after an in-depth evaluation of the variables used in recent studies and of the nature of wildfire occurrence, 25 variables were specified and used in this study to generate reliable and accurate wildfire susceptibility maps. In addition, the separate susceptibility maps for wildfires induced by stubble burning, lightning strikes, discarded cigarette butts and power lines make the results invaluable for assessing wildfire occurrence rates.
The aim of this study is to specify unknown wildfire causes by generating wildfire susceptibility maps for fires originating from stubble burning, lightning, discarded cigarette butts and power lines using the MaxEnt model. The study should provide valuable information about the source of wildfires of unknown cause. A total of 3,347 wildfire locations were recorded, and the causes of 1,714 were unidentified. Considering the main causes of these 3,347 wildfires, 87.5% were due to stubble burning, lightning strikes, discarded cigarette butts and power line-induced ignitions. The other 12.5% were due to sabotage, campfires and traffic accidents. Because 52.0% of the causes are unknown in the study area, specifying their origin will be useful for prevention procedures and for further understanding the main causes of wildfires.

Study area
The study area covers the provinces of Muğla, Aydın and Antalya, an area of 40,947 km² located between 26°59′59″ and 32°39′29″ E and 36°5′31″ and 38°6′53″ N. It lies at the intersection of the Aegean and Mediterranean seas and is important for tourism, pine honey production (90% of the world total) and the forest industry (Fig. 1). The population triples in the tourist season (June-September) compared to the other months. Both the increasing population and the fire-prone forests make the area very vulnerable to wildfires. Approximately 22% of Turkey's wildfires occurred in Aydın (2.0%), Muğla (10.5%) and Antalya (9.5%) provinces (URL 1). The climate of the study area is hot and temperate. Precipitation falls mostly in the winter months, with relatively less in the summer. The area is mostly covered by coniferous forests between 100 and 2100 m.

Methodology
MaxEnt is a machine learning-based algorithm used to determine the spatial distribution of species by evaluating sample point locations and independent variables. It uses presence data to predict the possible distribution of species according to the maximum entropy theory (Phillips et al. 2006; Elith et al. 2011). MaxEnt estimates the probability distribution of an event by finding the distribution of maximum entropy. In addition, the model identifies the most overlapping value of each presence point and the spatial variables to determine the possible distribution of species in each variable, i.e., it adjusts the probability to the most uniform or diffuse distribution according to the independent variables at each presence point (Phillips et al. 2006). Therefore, the model requires presence data (existing wildfires) and independent variables (environmental, topographic, forest and meteorological criteria) to determine the distribution. In this study, existing wildfires are treated as species to estimate their distribution. The MaxEnt software lets the user control how the possible distribution is generated through various learning parameters and graphical outputs, and it reports validation and accuracy results based on AUC (Area Under the Curve) measurements and the jackknife method, a resampling technique useful for bias and variance estimation.
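To make the AUC measure mentioned above concrete, the following is a minimal sketch and not the MaxEnt software's own implementation. It assumes AUC is read as the probability that a randomly chosen presence point receives a higher predicted value than a randomly chosen background point, with ties counted as one half. The function name and the score lists are hypothetical illustrations, not data from this study.

```python
def auc(presence_scores, background_scores):
    """Rank-based AUC: share of (presence, background) pairs in which the
    presence point has the higher score; ties count as half a win."""
    if not presence_scores or not background_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for p in presence_scores:
        for b in background_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5
    return wins / (len(presence_scores) * len(background_scores))


# Hypothetical predicted values at wildfire (presence) and background locations.
presence = [0.9, 0.8, 0.4]
background = [0.7, 0.3, 0.2, 0.1]
print(round(auc(presence, background), 3))
```

A value of 0.5 corresponds to a model that ranks presence and background points no better than chance, and a value of 1.0 to perfect separation.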

Variable selection
Several spatial variables were used to generate spatial distribution maps of wildfires caused by stubble burning, lightning, power lines and discarded cigarette butts. The variables were specified by considering the parameters that trigger, cause or indirectly affect wildfires, covering topographic, meteorological, environmental, climatic and forest dynamics. A total of 25 variables were identified and mapped with ArcGIS 10.5 software. Examples of the elevation and forest type variable maps are provided in Fig. 2.
All variables were retrieved from global and regional data sources at the specific resolutions given in Table 2 and mapped in the WGS84 projection, with 18,850 columns and 6,740 rows (a pixel size of 0.0031 in the WGS84 projection). All maps were converted from GeoTIFF to ASCII raster files for use in the MaxEnt software.
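As an illustration of the ASCII raster format referred to above, the sketch below writes a small grid in the ESRI ASCII grid layout (a six-line header followed by rows listed from north to south). It is not the conversion workflow used in this study, which relied on ArcGIS; the function name, the toy grid, the corner coordinates and the cell size are all hypothetical.

```python
def to_ascii_grid(rows, xllcorner, yllcorner, cellsize, nodata=-9999):
    """Serialize a 2D list (first row = northernmost) as an ESRI ASCII grid.
    Cells given as None are written as the NODATA value."""
    nrows = len(rows)
    ncols = len(rows[0]) if nrows else 0
    if nrows == 0 or ncols == 0:
        raise ValueError("grid must contain at least one cell")
    if any(len(row) != ncols for row in rows):
        raise ValueError("all rows must have the same number of columns")
    header = [
        f"ncols {ncols}",
        f"nrows {nrows}",
        f"xllcorner {xllcorner}",
        f"yllcorner {yllcorner}",
        f"cellsize {cellsize}",
        f"NODATA_value {nodata}",
    ]
    body = [
        " ".join(str(nodata if value is None else value) for value in row)
        for row in rows
    ]
    return "\n".join(header + body) + "\n"


# Hypothetical 2 x 3 elevation grid with one missing cell.
grid = [[1, 2, None],
        [4, 5, 6]]
text = to_ascii_grid(grid, xllcorner=26.0, yllcorner=36.0, cellsize=0.5)
print(text)
```

All layers passed to a single model run need the same header (extent, cell size and dimensions), which is why the variables are brought to one common grid before conversion.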

Historical wildfire data
Data from 3,347 wildfires that occurred in the provinces of Aydın, Muğla and Antalya between 2013 and 2020 were used (Fig. 3). Wildfire locations, extinguishing time, causes, duration, area burnt and other information such as intervention staff, vehicles and equipment used were retrieved from the wildfire database of the Republic of Turkey General Directorate of Forestry. The main causes were burning of stubble, lightning strikes, power lines, and discarded cigarette butts. While there are other causes such as vehicles (traffic accidents), campfires, heat objects and terrorism, these are insufficient to reveal accurate and spatially dependent results because they occur randomly, and they were therefore excluded from the sample point dataset. Using ArcGIS 10.5 software, all wildfires with the same cause were extracted into a separate file for use in the MaxEnt software.
Considering the distribution of wildfires each year, 29% were caused by lightning, 5% by discarded cigarette butts, 4% by power lines, 3% by stubble burning, and 7% by other causes (sabotage, traffic accidents, campfires, heat objects, explosions). However, the cause of 52% of the total wildfires could not be determined. The fact that the cause of more than half of all forest fires is unknown creates a major deficiency in forest fire prevention and prediction. On the other hand, when the yearly distribution of wildfire causes is analyzed, there is a significant increase in wildfires due to other causes. The decrease in unknown wildfires in 2019 and 2020 is likely due to the new observation systems of drones and watchtowers and the increased number of forest guards. Excluding unknown wildfires, most wildfires are caused by lightning. The causes of wildfires and the distribution of their total numbers are shown in Fig. 4.

Stubble burning
Although stubble burning is effectively banned by local governments, it remains the most common way to dispose of grain residues after harvest. Setting fire to straw stubble can easily ignite dry material and quickly destroy lands near forests. Since stubble burning takes place only in agricultural lands, the highest gain and relationship should be determined for the land use variable.
The stubble burning-induced occurrence probability map was generated using 49 presence points and 44 test points. The AUC values were 0.870 for sample points, 0.690 for test points and 0.794 if only test points were used. The AUC values for presence and test points (0.870 and 0.794) indicated that the sample and test points were selected homogeneously and that the resulting distribution map was accurate (Fig. 5).
When the contribution rates of the variables were evaluated, the land use and road variables contributed most, with 55.7% and 9.7%, respectively, as stubble burning occurs only on agricultural lands. Since stubble burning usually starts from the roadsides of agricultural lands for easier control, a higher contribution rate was determined for the road variable. Most agricultural lands are located between 500 and 1500 m elevation, where the average forest density is below 50%. Therefore, stubble burning does not pose a threat to the dense forest cover in the study area. Contribution rates and AUC graphs are shown in Fig. 6.
Considering the overlapping results of wildfire locations with the land use variable, 38.3% overlapped with agricultural lands and 13.8% with transitional woodlands that form the border between agricultural lands and

Lightning
Lightning-induced wildfires occur more frequently than wildfires from other causes in the study area. Between 2013 and 2020, 950 wildfires occurred due to lightning strikes. Higher elevations and dense forest cover appear to be highly prone to lightning strikes. As lightning is a natural event, it is more difficult to predict and prevent than other causes, but it is one of the main triggers of wildfires. The lightning-induced wildfire occurrence probability map was generated using 950 wildfire locations, of which 564 were assigned as presence points and 386 as test points. The AUC values were 0.836 for sample points, 0.783 for test points and 0.815 if only test points were used. The AUC values showed that the presence and test points were selected homogeneously and that the resulting probability map was accurate (Fig. 7).
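The text does not state how the 950 locations were divided into presence and test points. Purely as an illustration, and assuming a simple random split, the partition could be sketched as follows. The function name, the seed and the placeholder coordinates are hypothetical.

```python
import random


def split_presence_test(points, n_test, seed=0):
    """Randomly partition point records into (presence, test) lists.
    Assumption: a plain random split; the study's actual procedure is not stated."""
    if not 0 <= n_test <= len(points):
        raise ValueError("n_test must be between 0 and the number of points")
    shuffled = list(points)                # copy so the input order is untouched
    random.Random(seed).shuffle(shuffled)  # seeded for a reproducible split
    return shuffled[n_test:], shuffled[:n_test]


# Placeholder records standing in for the 950 lightning-induced wildfire locations.
points = [(i, i) for i in range(950)]
presence_points, test_points = split_presence_test(points, n_test=386)
print(len(presence_points), len(test_points))
```

Fixing the seed keeps the split reproducible, so AUC values computed on the test points can be compared across runs.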
When the contribution values of the variables were examined, BIO7 (22.3%), land use (16.7%), BIO3 (15.3%) and leaf type (13.8%) provided the most useful information about lightning-induced wildfire occurrence. Contribution rates and AUC graphs are given in Fig. 8.
The land use variable was highly correlated with lightning as expected, and 76.5% of the wildfire locations overlapped with the forest class. Of the lightning-induced wildfires, 54.7% occurred in coniferous forests and 23.2% in transitional forests. Wildfires occurred at elevations from 2 to 2030 m, with most above 800 m. Since BIO7 is an indicator of the annual temperature range and lightning occurs mostly in forest areas, leaf type has a decisive role in the ignition of wildfires when lightning occurs (Table 4).
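Overlap percentages such as those above can be obtained by sampling the land use class at each wildfire location and counting the classes. The sketch below shows only the counting step, under the assumption that the class at each point has already been extracted from the raster. The function name and the sample list are hypothetical and do not reproduce the study's data.

```python
from collections import Counter


def class_shares(classes):
    """Return the percentage of points falling in each land use class."""
    if not classes:
        raise ValueError("at least one sampled class is required")
    counts = Counter(classes)
    total = len(classes)
    return {name: round(100.0 * n / total, 1) for name, n in counts.items()}


# Hypothetical land use classes sampled at ten wildfire locations.
sampled = (["coniferous forest"] * 5
           + ["transitional forest"] * 3
           + ["agricultural land"] * 2)
shares = class_shares(sampled)
print(shares)
```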

Power lines
Contact by trees with high-voltage power lines can cause arcing (Mitchell 2013), and winds can move branches to power lines and create sparks. The electric arc can fall to the ground or directly ignite trees. In this case, trees near power lines are more at risk than others.
The power line-based wildfire susceptibility map (Fig. 9) was created using 132 wildfire locations, 90 of which were designated as presence points and 42 as test points. The AUC values were 0.904 for sample points, 0.765 for test points and 0.860 if only test points were used. In this study, the highest gain was in power lines with a value of 0.904. In the power line-induced wildfire susceptibility map, the variables of roads (33.8%), power lines (12.3%), precipitation (8.5%) and land use (7.8%) played a decisive role in ignition by electric arcs. The road variable was highly correlated with arc-based ignitions, as power lines are often established along roads because of the highly sloping topography. The AUC graph and jackknife test results are shown in Fig. 10.
Most of the wildfires occurred at a distance of less than 500 m from roads; the average distance was calculated as  (Table 5).

Cigarette butts
Discarded cigarette butts are one of the most common causes of wildfires and are difficult to detect because they have no relationship to any environmental variable as cigarettes may be thrown from cars, roads, campsites and residential areas. For this reason, it is difficult to identify areas that are prone to wildfires caused by cigarette butts and that need to be examined or spatially determined, considering the land use variable. Although the reason is simple, prevention is difficult.
The cigarette butt-induced wildfire susceptibility map was created using 166 wildfire locations, of which 100 were presence points and 66 were test points. The AUC values were 0.868 for the sample points, 0.808 for the test points and 0.860 if only test points were used. The results reveal that most of the wildfire locations were clustered in dense forest cover near settlements in Aydın and Muğla provinces (Fig. 11).
BIO15 (23.1%), land use (9.7%) and wind (8.5%) play a decisive role in cigarette butt-induced wildfires. BIO15 (seasonal precipitation) has the highest contribution to the dampness of forest areas (Fig. 12). As this variable increases, the probability of cigarette butts igniting dry materials on the forest floor decreases. In addition, the land use variable made a high contribution as people generally throw cigarette butts on roads, camping areas and in settlements. Wind may be a trigger for the increase of wildfire occurrence.
The average wind speed was determined as 2.5 km/h, which is a higher wind speed than those for other causes. Therefore, it is possible that wind has an initial role in cigarette butt-induced wildfires. When assessing land use, 30.7% of wildfires overlapped with coniferous forests and 19.3% with agricultural lands and transitional forest cover. The road variable is 6.0% and represents the most prominent locations for spatially risky areas where forest roads are prone to ignition (Table 6).

Identifying unknown causes of wildfires
To determine the causes of unknown wildfires, spatial intersection analysis was used to evaluate the relationships between the stubble burning, discarded cigarette butt, power line and lightning-induced susceptibility maps and the unknown wildfire locations. All locations were individually overlapped with the susceptibility maps, and the cause whose risk value showed the highest correlation was assigned as the related wildfire cause (Table 7). The highest correlation values are shown in bold in Table 7. There is a clear difference between the intersection ratios of the susceptibility maps, and these differences allowed the causes to be identified more clearly and accurately. Wildfires induced by lightning strikes, stubble burning and power lines were determined more accurately than those induced by cigarette butts, as these causes are highly correlated with land use. For example, while stubble burning is carried out only on agricultural lands, power lines are established along roadsides and in forested areas. Since the discarding of cigarette butts is related to human activities, they can be thrown into any land use class, such as roads, camping areas, residential areas, forested areas, and agricultural lands. Therefore, the assignment of cigarette butt-induced causes to wildfires of unknown origin is less accurate. Considering the total intersection points of wildfire locations (the sum of the intersection rates of each risk map), the rates included 73.1% for power lines and 65.3% for lightning. After cause assignment to unknown wildfires, 281 new forest fire points were added to stubble burning, 290 to lightning, 334 to power lines and 361 to cigarette butts. A total of 1,266 unknown forest fire causes were identified; for 448 forest fire locations no relationship could be detected, possibly because they were due to sabotage, traffic accidents, heat objects and other causes such as terrorism.
Forest fires caused by terrorism usually aim to destroy very large areas and are sometimes started at multiple points at the same time using highly flammable fuels to spread the fire rapidly. The MaxEnt model was reapplied to all causes with the newly assigned wildfire points included to verify the similarity of the distributions. Spatial distribution maps of wildfires before and after cause assignment to unknown wildfires are provided in Fig. 13. The spatial distribution maps produced for all causes overlap, and the distribution characteristics of the clusters and risk zones were similar.
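The cause-assignment step described in this section, in which each unknown wildfire location is overlaid on the four susceptibility maps and given the cause with the highest value, can be sketched as below. This is a simplified illustration and not the authors' GIS workflow. The function name, the `min_value` cut-off used to leave a location unassigned, and the sample values are hypothetical assumptions.

```python
def assign_cause(values, min_value=0.0):
    """Pick the cause whose susceptibility value is highest at one location.
    Returns None when no map exceeds min_value (hypothetical cut-off),
    i.e. when no relationship with any cause can be detected."""
    cause, best = max(values.items(), key=lambda item: item[1])
    if best <= min_value:
        return None
    return cause


# Hypothetical susceptibility values sampled at two unknown wildfire locations.
unknown_fires = {
    "fire_a": {"stubble": 0.12, "lightning": 0.61,
               "power_lines": 0.08, "cigarette": 0.30},
    "fire_b": {"stubble": 0.0, "lightning": 0.0,
               "power_lines": 0.0, "cigarette": 0.0},
}
assigned = {fire: assign_cause(vals) for fire, vals in unknown_fires.items()}
print(assigned)
```

Locations that receive a cause can then be appended to that cause's presence points before the model is rerun, while unassigned locations stay in the unknown group.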
Besides the similarity of both types of maps, the statistical values and variable contributions should be evaluated to determine whether the cause assignment was accurate (Table 8). For all wildfire causes, training AUC values increased when the unknown wildfire locations were included, and test AUC values decreased only for the power line and cigarette butt causes. This increase is a valuable indicator of the accuracy of the cause assignment and should be evaluated together with the contribution rates of the variables. The order of the variables by contribution rate did not change. However, the contribution rates changed mostly for the first-ranked variables, with smaller changes for the lower-ranked variables. The differences in first-ranked variable contribution rates were 11.9% for land use in stubble burning, 14.7% for BIO7 in lightning, 10.1% for roads in power lines and 1.9% for BIO15 in cigarette butts. Because the order of the variables did not change and the differences between the variable contribution rates are not large, the spatial distribution maps of wildfires were similar.
The relationship between fire ignition and environmental causes has been analyzed using different models. Most studies used logistic regression-based models (Chuvieco et al. 2009; Martinez et al. 2009), generalized linear models (Syphard et al. 2008; Martinez et al. 2009; Kwak et al. 2012; Vilar et al. 2016; Costafreda-Aumedes et al. 2017; Li et al. 2019) and the random forest algorithm (Archibald et al. 2009; Aldersley et al. 2011; Olivera et al. 2012; Arpaci et al. 2014; Rahmati et al. 2016) due to the presence of nonlinear and nonparametric variables. Another approach is the machine learning-based MaxEnt method, used to discover the relationship between variables and ignition possibilities (Parisien and Moritz 2009; Bar-Massada et al. 2012, 2013; Renard et al. 2012; De Angelis et al. 2015; Vacchiano et al. 2018; Yago et al. 2019; Bekar et al. 2020; Banerjee 2021; Tariq et al. 2022). While most studies considered wildfires only as a site of occurrence, this study considered historical wildfires according to their anthropogenic and natural causes to provide a cause-level vulnerability analysis. The results show that the variables responsible for each cause are different and need to be evaluated separately to produce reliable, accurate wildfire susceptibility maps. The advantage of this study is that anthropogenic and natural causes of forest fires are evaluated separately. A similar approach was used by Vacchiano et al. (2018), who studied fires by clustering them into summer and winter wildfires. However, that study considered mainly meteorological and bioclimatic variables for mapping wildfire susceptibility with the MaxEnt method, and generated maps of anthropogenically induced forest fire susceptibility using only the variables of distance to roads and settlements. In the present study, settlements, roads, power lines, imperviousness, and land use variables were used to determine the relationship between wildfires and anthropogenic causes.
One of the main differences, and the originality of our study, was to determine these causes in detail, i.e., discarded cigarette butts, power lines and stubble fires. Therefore, rather than creating general susceptibility maps, this study created susceptibility maps for the individual causes, resulting in more accurate and reliable maps for preparedness and prediction.
Both Vacchiano et al. (2018) and this study found that lightning is the most common natural cause of wildfires. In terms of lightning strike probability, land use and topography are the strongest predictors of ignition possibility (Krawchuk et al. 2006). In addition to land use, a lightning-induced wildfire occurs when the strike reaches the forest floor, where plants and dry soil initiate a surface fire (Wotton et al. 2003; Conedera et al. 2006). However, the condition of the forest floor and its degree of dryness are unknown and cannot be included as a variable for mapping susceptibility. Instead, considering multiple variables captures this factor indirectly, as the degree of soil dryness depends on weather conditions, leaf type, and stand density (Schoennagel et al. 2004).

Conclusion
This study makes three main contributions: it includes the highest number of variables in the literature for susceptibility mapping, produces susceptibility maps separately according to the causes of wildfires, and assigns causes to forest fires whose cause could not be determined. The susceptibility maps can be used to determine prevention procedures, considering the main causes of forest fires. For example, lightning rods can be installed in the regions that are highly affected by lightning strikes, since more forest fires are induced by lightning than by other causes. Similarly, warning signs can be placed in areas where forest fires from discarded cigarette butts are most common to raise awareness. Because stubble burning takes place after the harvest season, control and early intervention methods such as directing fire trucks to highly sensitive areas and real-time monitoring with drones can be used. Since 25 variables were included in this study and numerous causes of forest fires were considered, the risk zones and areas susceptible to fires were not determined superficially. The accuracy results for the test points show that the generated susceptibility maps can be used by authorized personnel and institutes. Moreover, the identification of 73.8% of the unknown causes of forest fires is the most significant contribution of this study to forest fire assessment.
There are also some limitations in generating MaxEnt-based sabotage, terrorism, and traffic accident susceptibility maps. Numerous sabotage and terrorism-induced forest fires occurred in Turkey, especially in 2021. While some of these were caused by terrorism, many occurred because people deliberately started forest fires, called sabotage, to gain new farmland or construction sites. Since the study area has a high tourism value, new residential areas are needed. For this reason, forest fires caused by sabotage mostly occur on the coasts. Although generating susceptibility maps for these causes is important for the protection of forest cover, these causes have a heterogeneous distribution and are not related to any variable, since ignition can start from roads, forests, and urban areas in all seasons and in all climatic conditions. Therefore, sabotage and terrorism-induced forest fire susceptibility maps would not reflect the actual situation of sensitive areas, as these causes are not spatially related to any variable class. Similarly, since traffic accidents can occur on any road in the study area, fires caused by these accidents may occur in any season and under any conditions. As a result, 26.2% of the total unknown forest fire causes remain undetermined in this study.

Conflict of Interest
The authors declare that they have no conflict of interest. The authors declare that they have no known competing financial interests or personal relationships that could have influenced the research in this paper.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.