Logistic regression versus XGBoost for detecting burned areas using satellite images

Militino, A. F.; Goyena, H.; Pérez-Goya, U.; Ugarte, M. D.

doi:10.1007/s10651-023-00590-7


Open access
Published: 20 January 2024

Volume 31, pages 57–77, (2024)

Environmental and Ecological Statistics




Abstract

Classical statistical methods prove advantageous for small datasets, whereas machine learning algorithms can excel with larger datasets. Our paper challenges this conventional wisdom by addressing a highly significant problem: the identification of burned areas through satellite imagery, which is a clear example of imbalanced data. The methods are illustrated in North-Central Portugal and North-West Spain in October 2017 within a multi-temporal setting of satellite imagery. Daily satellite images are taken from Moderate Resolution Imaging Spectroradiometer (MODIS) products. Our analysis shows that a classical logistic regression (LR) model performs on par with, if not better than, a widely employed machine learning algorithm, the extreme gradient boosting algorithm (XGBoost), within this particular domain.


1 Introduction

Different boosting methods and logistic regression (LR) models have been jointly analyzed from a methodological perspective for more than 20 years, obtaining similar performances (Friedman et al. 2000). One reason could be that boosting can be interpreted as an approximation to additive modeling on the logistic scale using maximum (Bernoulli) likelihood (Friedman 2001). More recently, the comparisons have been extended to many versions of boosting in the context of different models and different health (Ingwersen et al. 2023; de Menezes et al. 2017) or environmental applications (Arabameri et al. 2019; Rizeei et al. 2019). These studies reaffirm the proximity between LR and boosting algorithms, but with some differences depending on the boosting version, the modelling and the data. It is frequently assumed that machine learning methods outperform traditional statistical procedures, in particular when dealing with large datasets. In this paper we assess the performance of both methods for detecting burned areas using satellite images. It is worth mentioning that the dataset used in this context is characterized by severe class imbalance and a large volume of data.

Satellite images are crucial sources of information for monitoring wildfires on the Earth's surface and, specifically, the generation of global images of Burned Areas (BA) has been an important issue since the late 1990s. Consequently, since the 2000s, it has been possible to find periodical and global BA products routinely derived worldwide, but with a limited level of accuracy. An excellent review of developments in the detection of burned areas with remote sensing data is provided by Chuvieco et al. (2019).

The most popular Burned Area product is MCD64A1 of the Moderate Resolution Imaging Spectroradiometer (MODIS) (Tomshin and Solovyev 2021) mission, because it provides a monthly global gridded 500 m image of burned areas and quality information (Giglio et al. 2018) for the whole Earth. The algorithm used for creating this product relies on a burn-sensitive vegetation index (VI) to create dynamic thresholds. However, some disturbances caused by clouds, atmospheric absorption and sensor-introduced noise can still be present (EEDC 2022), and erroneous identifications of burned areas are possible. For example, in the European Mediterranean countries, MCD64A1 shows a 25% BA overestimation with regard to the European forest fire information system (EFFIS) (Turco et al. 2019a). FireCCI51 is another global BA product (Lizundia-Loiola et al. 2020), available from 2001 to 2020 and based on the MODIS 250 m reflectance product, but recent studies show little improvement of this product with regard to MCD64A1 (Hall et al. 2021; Vetrita et al. 2021). There are also specific contributions for detecting burned areas with the Landsat and Sentinel missions, but regrettably they are only available in specific regions.

Overall, satellite BA products are based on a great variety of algorithms with different efficiencies depending on image resolution and ecosystem variety. Recently, the fast evolution of machine learning techniques has improved the detection of burned areas (Jain et al. 2020) and the study of wildfire spread patterns (Khanmohammadi et al. 2022). Known methods such as random forest (Ramo and Chuvieco 2017; Belgiu and Drăguţ 2016), support vector machines (Zhang et al. 2015; Petropoulos et al. 2011), artificial neural networks (Mas and Flores 2008), convolutional neural networks with long short term memory (LSTM) (Pinto et al. 2020) and geometric semantic genetic programming (Castelli et al. 2015) are available for detecting burned areas, yet gradient boosting based models are widely used. LR has also been used for classifying burned areas for more than two decades (Koutsias and Karteris 2000; Bastarrika et al. 2011), but only recently have machine learning techniques become more popular, mainly due to the need to manage big datasets. Comparisons between statistical and machine learning approaches are scarce, but in some cases many similarities are observed in the predictive performances (Ramampiandra et al. 2023).

In this paper, we evaluate the effectiveness of burned area detection in a region spanning over 100,000 km$^2$ on the Iberian Peninsula. We utilize remote sensing data and compare the performance of two classifiers with distinct approaches: a traditional statistical classifier, LR, and a machine learning-based classifier, the extreme gradient boosting algorithm (XGBoost). The application presented in the paper starts with a detailed description of the procedure: (a) we briefly describe the MODIS products, (b) we explain the definition of the auxiliary variables and the reference classification variable, and (c) we compare both classifiers for the identification of burned and non-burned pixels. We know that the presence of highly imbalanced data, specifically burned and unburned pixels, can significantly complicate the estimation process, but applying both procedures to the same data allows a fair evaluation. The validation is made by comparing the predicted classification with the target reference and with other external classifications not involved in the estimation process. Both procedures use the same auxiliary variables.

The rest of the paper is organised as follows. Section 2 describes the study region and the data. Subsection 2.1 includes specific subsections for MODIS remote sensing data, spectral indices and additional products. Subsection 2.3 elucidates the process of acquiring valuable data and constructing the input dataset. It encompasses explanations of the differences of spectral indices, the average density of active fires and the definition of the reference classification. XGBoost and LR are briefly explained in the context of our data analysis in Subsects. 3.1 and 3.2 of Sect. 3, respectively. The final results, and the accuracy metrics obtained for the validation process of XGBoost and LR, are shown in Sect. 4. Finally, the paper ends with some conclusions.

2 Study region and data

The region of interest covers several regions of the Iberian Peninsula, including Galicia and the Portuguese regions of Santarém, Braga, Vila Real, Coimbra, Guarda, Aveiro, Viseu, Castelo Branco, Portalegre, Bragança, Porto and Viana do Castelo, with an extension of about 84,348 km$^2$. Figure 1 illustrates the extent of burned areas within the study region over the designated time period. Galicia is a region of roughly 30,000 km$^2$ located in the north of Spain, above Portugal, that concentrates a high number of the fires in Spain. In 2017, approximately 80% of the 620 km$^2$ burned in Galicia burned within a span of 2 days. During this period, more than 20 fires resulted in burned surfaces exceeding 5 km$^2$. Portugal, with a land area of about 92,090 km$^2$, consists of over 66% forested land. In 2017, Portugal lost more than 5000 km$^2$ to wildfires, the greatest area lost in a single year (Turco et al. 2019b). It is the European country most affected by fires during the last decade (San-Miguel-Ayanz et al. 2020).

For the region of interest, multiple variables have been generated from various data sources. These sources provide information in a variety of formats, including both vectorial and raster formats. The data are standardized into a stack of rasterized images and projected with the MODIS mission format, denoted as SR-ORG:6974. Most of the data used in this work are derived from multi-spectral satellite images that explicitly capture information related to burned areas; these data are presented in Subsect. 2.1. Additional information, obtained primarily from vectorial sources, is presented in Subsect. 2.2. Finally, Subsect. 2.3 describes the data processing used for extracting valuable information.

2.1 Multi-spectral data

The study requires a substantial amount of data, and MODIS provides a significantly larger variety of variable types related to burned areas than other satellite programs. The MODIS program has two satellites, Terra and Aqua, both of which capture daily images of the Earth's surface at 500 m spatial resolution. Both satellites cross the same orbit with a 3 h lag, allowing each to fill in the missing values of the other.

The download and data loading into the R software (R Core Team 2023) was performed using the rsat package (Pérez-Goya et al. 2021). This package assembles the images by the region and time of interest in an object that contains images covering the region of interest for the 91 days of September, October and November 2017. Using rsat, we import a total of 182 layers for each spectral index. This comprises 91 daily layers from MOD09GA and an additional 91 daily layers from MYD09GA, sourced from Terra and Aqua satellites, respectively.

2.1.1 Spectral indices

Table 1 shows the definition of the spectral indices, and the pre-fire and post-fire differences used in this study. The normalized burn ratio (NBR) is the most popular spectral burn index, originally developed for identifying burned areas (García and Caselles 1991), and later used for burn and fire severity assessment (Lutes et al. 2006). It is defined with the near-infrared and the shortwave-infrared bands, both sensitive to burning but in opposite ways. Other indices, such as the normalized burn ratio 2 (NBR2) (Santana et al. 2018), the burn-sensitive vegetation index (MVI) (Giglio et al. 2009), and the mid-infrared bispectral index (MIRBI) (Trigg and Flasse 2001; McCarley et al. 2018), are also highly responsive to variations in live green vegetation or moisture content. The normalized difference vegetation index (NDVI) is a well-known indicator of vegetation. A zero value means no vegetation, and a value near 1 indicates a high level of vegetation. We also use the near infrared (NIR) band 2 of MODIS, commonly used for monitoring temporal burn signatures (Tucker 1979; Mohler and Goodin 2010). Values close to zero are associated with unburned areas. All of these indices decrease significantly after a fire, becoming good indicators of burned pixels (Chen et al. 2011). More comprehensive descriptions of the implications of spectral indices in the context of burned area monitoring can be found in the literature (Pereira 1999; Libonati et al. 2010, 2011).

Table 1 Definition of the spectral indices in terms of the red (R), near infrared (NIR), shortwave-infrared (SWIR1, SWIR2) and thermal infrared (TIRS1) bands; pre-fire and post-fire differences of spectral indices identified with the suffixes pre and post respectively
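As an illustration of how normalized indices of this kind and their pre-fire/post-fire differences are computed, the following minimal sketch may help. It assumes the common textbook forms NBR = (NIR − SWIR2)/(NIR + SWIR2) and NDVI = (NIR − R)/(NIR + R); the exact definitions used in the study are those of Table 1, and the function names and reflectance values below are hypothetical.

```python
def normalized_difference(a, b):
    """Return (a - b) / (a + b), or None when the denominator is zero."""
    total = a + b
    if total == 0:
        return None
    return (a - b) / total


def ndvi(nir, red):
    # Assumed textbook form of the normalized difference vegetation index.
    return normalized_difference(nir, red)


def nbr(nir, swir2):
    # Assumed textbook form of the normalized burn ratio.
    return normalized_difference(nir, swir2)


# Hypothetical surface reflectances for one pixel before and after a fire.
pre = {"red": 0.05, "nir": 0.40, "swir2": 0.10}
post = {"red": 0.10, "nir": 0.15, "swir2": 0.25}

nbr_pre = nbr(pre["nir"], pre["swir2"])
nbr_post = nbr(post["nir"], post["swir2"])

# Pre-fire minus post-fire difference: positive when the index drops after burning.
d_nbr = nbr_pre - nbr_post
```

In this toy pixel the index falls after the fire, so the difference is positive, which is the behaviour the difference indices are designed to capture.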


Data obtained from satellites may contain invalid information due to cloud cover. The Terra and Aqua daily cloud masks are used to remove the cloudy observations from the classification process. This work reduces cloud gaps by using composite images of indices. These images are obtained by overlaying the daily indices defined with MOD09GA on the corresponding daily indices defined with MYD09GA. That is, we substitute unavailable pixels of the MOD09GA indices with available MYD09GA pixels of the corresponding indices, reducing the number of unavailable or erroneous pixels. Several gap-filling methods have been developed to address this issue (Militino et al. 2019a, b; Wang et al. 2022).
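The compositing rule described above can be sketched as follows. This is only an illustration on flat lists of pixel values, with `None` standing for a masked or unavailable pixel; the function name and the values are hypothetical and do not reproduce the raster workflow used in the study.

```python
def composite(terra, aqua):
    """Fill missing Terra values (None) with the Aqua value at the same pixel."""
    if len(terra) != len(aqua):
        raise ValueError("layers must have the same number of pixels")
    return [a if t is None else t for t, a in zip(terra, aqua)]


# Hypothetical daily index values for four pixels (None = cloudy or missing).
terra_index = [0.61, None, 0.55, None]
aqua_index = [0.60, 0.58, None, None]

combined = composite(terra_index, aqua_index)
```

A pixel remains missing in the composite only when it is unavailable in both layers, which is why the composite reduces but does not eliminate gaps.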

2.2 Additional products

Apart from using multispectral indices, supplementary products incorporating additional data have been employed to enhance the detection of wildfires. One of those is the MCD12Q1 land cover product which provides yearly land cover maps at a spatial resolution of 500 m, featuring 17 classification legends. The first 11 categories correspond to different types of vegetation, while the remaining categories encompass a variety of croplands, urban and built-up lands, and water bodies. This classification enables the derivation of a binary variable to determine burnability. If a pixel corresponds to one of the first 11 categories, it is considered burnable, whereas pixels falling outside of these categories are deemed unburnable.

The study also includes specific products related to the fires and burned areas:

1. Fire location products. Fire location products allow us to find the date of the nearest fire to compute the pre-fire and post-fire differences. They also improve the detection of burned areas by incorporating features that include spatial information, such as the distance to the nearest fire and the intensity of the point process of active fires.
   (a) The fire location product MCD14DL of October 2017. This is a monthly product of near real-time (NRT) MODIS Thermal Anomalies or fire locations representing the center of a 1 km spatial resolution pixel.
   (b) The visible infrared imaging radiometer suite (VIIRS) of October 2017. It detects active fires and other thermal anomalies (VIIRS 2021) providing a means to identify fire-induced changes in surface reflectance (Loboda et al. 2007). VIIRS data complements and enhances MODIS (MCD14DL) fire detection (NASA 2020).
   Both products provide vector files. Then, to use them in addition to the spectral indexes, they are reprojected into the grid that defines the MOD09GA images.
2. Burned area products. Burned area products serve a dual purpose. Firstly, they enable us to establish a reference classification for our methods. Secondly, they facilitate the validation of our classification models’ performance.
   (a) The EFFIS wildfire data base of October 2017. It contains perimeters of burned areas in Europe since 2003 in vector files (EFFIS 2021). This product is derived from the daily processing of MODIS satellite imagery at 250 m ground spatial resolution. The product is reprojected into the MOD09GA grid and used as a reference for the classification methods. The EFFIS reference classification is a binary variable where the burned pixels are those covered by the EFFIS burned areas. Building upon this reference, we create a refined classification reference, labeled as Lclass, for use in the validation process. Lclass removes atypical data from EFFIS classification, and identifies as burned pixels only those of the EFFIS burned areas greater than 2 km$^2$, and with $d.NBR1 > 0.1$. The unburned pixels are those not defined as burned in the previous step, but with $d.NBR1 \le 0.15$ for avoiding isolated burn scars not identified in the EFFIS database (Lutes et al. 2006).
   (b) The MCD64A1 product. It provides burned area data at a 500 m resolution grid. Even if the format is similar, we still need to reproject the MCD64A1 images into the MOD09GA grid. This product has the sole purpose of validating our results.
   (c) The FireCCI5.1 product. It provides burned area data at a 250 m resolution grid. Even if this product gives images, we still need to reproject them into the MOD09GA grid. This product is used for validation purposes only.
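The Lclass rule stated for the EFFIS product can be written as a small labelling function. The sketch below follows a literal reading of the two conditions: a pixel inside an EFFIS perimeter that fails the burned test is treated as unburned when its d.NBR1 is at most 0.15, and any pixel meeting neither condition is excluded. That reading, the function name and the argument names are assumptions for illustration.

```python
def lclass_label(in_effis_area, effis_area_km2, d_nbr1):
    """Return 1 (burned), 0 (unburned) or None (excluded from Lclass)."""
    if d_nbr1 is None:
        return None
    # Burned: inside an EFFIS burned area larger than 2 km^2 with d.NBR1 > 0.1.
    if in_effis_area and effis_area_km2 > 2.0 and d_nbr1 > 0.1:
        return 1
    # Unburned: not burned above, and d.NBR1 <= 0.15.
    if d_nbr1 <= 0.15:
        return 0
    # Neither condition holds (e.g. a possible burn scar outside EFFIS).
    return None


# Hypothetical pixels.
labels = [
    lclass_label(True, 5.0, 0.40),   # large EFFIS area, strong index drop
    lclass_label(False, 0.0, 0.05),  # outside EFFIS, no index drop
    lclass_label(False, 0.0, 0.30),  # outside EFFIS, strong index drop
]
```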

2.3 Input variables for the classifiers

Classification methods require a reference classification, which serves as the dependent or target variable, along with auxiliary or predictor variables. The reference classification is derived from EFFIS burned area data, while the auxiliary variables are obtained from differences of spectral indices, the distance of each pixel to the nearest fire (distAF) and the average intensity of active fires (aF.int).

2.3.1 Differences of spectral indices and distances to the nearest active fire

Differences of spectral indices are a frequent tool in change detection algorithms for identifying burned areas (Van Wagtendonk et al. 2004; Miller et al. 2009). The differencing process consists in subtracting the index value after the fire (post) from the index value before the detected fire (pre) (Eidenshink et al. 2007). Using the vector files of active fires of October 2017 from the MCD14DL and VIIRS products in the region of interest, we obtain the nearest active fire date using the Dirichlet tessellation, and we define the distance (distAF) of every pixel to the nearest active fire. The Dirichlet tessellation creates a polygon around each center point such that any point inside the polygon is nearer to that center point than to any other center point. Next, we assign to all pixels of the same polygon the date of the closest fire. Fire dates are used for identifying the eight previous and the eight posterior observations for every pixel. These images are drawn from the time series of composite indices defined between September and November 2017. Next, we calculate the difference indices by subtracting the mean of the eight observations after the fire date from the mean of the eight observations before it. The amplitude of 8 days is empirically the most suitable for time series of MODIS images (Giglio et al. 2018).
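The two steps of this subsection can be sketched in a few lines. Assigning each pixel to its closest active fire gives the same result as locating the Dirichlet cell that contains the pixel, so a brute-force nearest-neighbour search is used here in place of an explicit tessellation. The data layout, the function names, the exclusion of the fire day itself and the use of the eight nearest valid observations are assumptions made for illustration.

```python
import math


def nearest_fire(pixel_xy, fires):
    """Return (distance, day) of the active fire closest to the pixel."""
    best = min(fires, key=lambda f: math.dist(pixel_xy, f["xy"]))
    return math.dist(pixel_xy, best["xy"]), best["day"]


def difference_index(series, fire_day, window=8):
    """Mean of the last `window` valid values before the fire day minus the
    mean of the first `window` valid values after it (pre minus post)."""
    before = [v for v in series[:fire_day] if v is not None][-window:]
    after = [v for v in series[fire_day + 1:] if v is not None][:window]
    if not before or not after:
        return None
    return sum(before) / len(before) - sum(after) / len(after)


# Hypothetical active fires (projected coordinates, day index in the series).
fires = [{"xy": (0.0, 0.0), "day": 10}, {"xy": (10.0, 0.0), "day": 20}]
dist_af, fire_day = nearest_fire((3.0, 4.0), fires)

# Hypothetical daily composite index for that pixel: high before, low after.
series = [0.6] * 10 + [None] + [0.1] * 19
d_index = difference_index(series, fire_day)
```

For large rasters a spatial index or an explicit tessellation would be far more efficient than this exhaustive search; the sketch only shows the logic.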

2.3.2 Average density of active fires

Fire locations are typical examples of point processes (Borrajo et al. 2020), because they are realizations of a random point process in a two-dimensional space (Baddeley et al. 2015). The reference model of a point process is a uniform or homogeneous Poisson point process, where the number of points in a region A follows a Poisson distribution with mean $\lambda \cdot \text{area}(A)$, where $\lambda$ is the intensity of the process, defined as the expected number of points per unit area. When the point process is not homogeneous, as is the case in Portugal or Spain, where clusters of municipalities present a higher frequency of wildfires (Martinho 2018), the intensity can be effectively modeled by incorporating spatial coordinates (u) through linear, generalized linear or generalized additive models (gam). To gain model flexibility, a gam model is used here. The similarity between the Poisson log-likelihood and linear Poisson regression allows the intensity to be expressed as log-linear in the parameter $\theta$. Namely,

$$\begin{aligned} \text {log}\, \lambda _{\theta }(u)=S(u), \end{aligned}$$

where S(u) is a smooth function of the coordinates u. In this case, we use a thin-plate basis function of dimension $k=30$ (Turner 2009). Figure 2 shows the estimation of the average density computed with the R package spatstat (Baddeley and Turner 2005).
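To make the homogeneous reference model concrete, the sketch below evaluates the Poisson distribution of the number of points in a region with mean λ·area(A), and the intensity implied by a log-linear model. It is a numerical illustration only: the intensity and area values are hypothetical, and the smooth function S(u) fitted in the study is not reproduced.

```python
import math


def poisson_pmf(n, lam, area):
    """P(N(A) = n) for a homogeneous Poisson process with intensity lam."""
    mean = lam * area
    return math.exp(-mean) * mean ** n / math.factorial(n)


def loglinear_intensity(s_u):
    """Intensity implied by log(lambda(u)) = S(u) at one location u."""
    return math.exp(s_u)


# Hypothetical values: 0.02 fires per km^2 over a 100 km^2 region (mean 2).
lam, area = 0.02, 100.0
prob_no_fire = poisson_pmf(0, lam, area)
total = sum(poisson_pmf(n, lam, area) for n in range(60))
```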

3 Classifiers

The classifiers allow a supervised classification of burned and unburned pixels. We analyze the dataset using extreme gradient boosting and logistic regression.

The input file is obtained by generating a text file from the raster dataset, and thus it has the following variables: the differences of spectral indices called d.NBR1, d.NBR2, d.MVI, d.MIRBI, d.NDVI and d.NIR, the average density of active fires by pixel (aF.int), the distance to the nearest active fire (distAF), and the reference classification as dependent or target variable. It has around 500,000 observations.

3.1 The eXtreme Gradient Boosting method (XGBoost)

XGBoost (Chen and Guestrin 2016) is an advanced implementation of the gradient boosting method with many applications in Earth Sciences (Sahin 2022). It is a supervised ensemble learning method, where a single model combines the predictive power of multiple learners. The main ensemble approaches are boosting and bagging, both usually based on decision trees that predict the target variable from several input features. Boosting builds trees sequentially, each one reducing the errors of the previous trees, and it is appropriate for managing large sets of data without specific assumptions. The main advantages of decision trees are their relatively simple structure, the lack of assumptions, and their flexibility and robustness with regard to other methods (Alnahit et al. 2022). Decision trees can effectively deal with nonlinear relationships and diverse variable types, including both categorical and numerical variables. The main difference with bagging methods such as random forest is that boosting uses trees with few splits. In the training step, the parameters of the weak learner are fitted iteratively by minimizing an objective function. In this application, every learner is compared with its previous learners to minimize the binary classification error rate, computed as the ratio of the number of wrong cases over the total number of cases.

XGBoost randomly chooses a training set of 75% of observations and uses a tenfold cross validation over the training set to estimate the best hyperparameters. The optimized model is obtained for the hyperparameters achieving the minimum mean error among the folds. The main hyperparameters are: (1) ‘Learning rate’, that scales the contribution of each tree by a factor to prevent overfitting and can make the boosting process more conservative. It varies between 0.1 and 0.5. The optimum is 0.1. (2) ‘Maximum depth of a tree’, that controls the use of deeper trees, generating more complex models. It varies among 1, 5 and 10. Higher depth will result in more complex models, which are more likely to overfit. The optimum is 1. (3) ‘Minimum sum of instance weight needed in a child’, that provides minimum weights for further partitioning. If the tree partition step results in a leaf node with the sum of instance weight less than this weight, then the building process will give up further partitioning. It varies among 1, 3, 5, 7 and 9. The optimum is 7. (4) ‘Control of imbalanced classes’, that is fixed for the training set. This is a very specific hyperparameter that makes XGBoost more competitive than other machine learning methods in burned area detection. It is defined as the ratio of unburned over the burned pixels, i.e. 27.88. XGBoost has been implemented with the xgboost R package (R Core Team 2023), yet we have also used rsat (Pérez-Goya et al. 2021) and dependent packages for downloading, customizing and managing the images, vector and text files.
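The search space described above can be laid out as a plain parameter grid. The sketch below uses the XGBoost parameter names `eta`, `max_depth`, `min_child_weight` and `scale_pos_weight`; mapping the four hyperparameters of the text to these names is an assumed reading. The depths and child weights are the values listed in the text, whereas the two learning rates are only the end points of the stated range, since the intermediate values are not given. The toy labels are hypothetical and merely reproduce the 27.88 ratio; no model is fitted here.

```python
from itertools import product


def scale_pos_weight(labels):
    """Ratio of unburned (0) to burned (1) observations in the training set."""
    burned = sum(1 for y in labels if y == 1)
    unburned = sum(1 for y in labels if y == 0)
    if burned == 0:
        raise ValueError("training set contains no burned pixels")
    return unburned / burned


# Candidate values; the learning rates are an assumption (range end points).
grid = {
    "eta": [0.1, 0.5],
    "max_depth": [1, 5, 10],
    "min_child_weight": [1, 3, 5, 7, 9],
}


def candidate_settings(grid, labels):
    """Yield one parameter dictionary per grid combination, with the
    imbalance weight fixed from the training labels."""
    weight = scale_pos_weight(labels)
    names = list(grid)
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        params["scale_pos_weight"] = weight
        yield params


# Hypothetical training labels with 27.88 unburned pixels per burned pixel.
toy_labels = [1] * 100 + [0] * 2788
settings = list(candidate_settings(grid, toy_labels))
```

Each dictionary in `settings` would then be scored by tenfold cross validation on the training set, keeping the combination with the lowest mean error across folds.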

3.2 Logistic regression (LR)

LR is a popular statistical method for supervised classification (Hosmer et al. 2013) that predicts the probability of belonging to a binary class. Fitting LR models requires several assumptions: (a) a binary response variable, (b) independent observations, (c) absence of multicollinearity among explanatory variables, (d) no extreme outliers and (e) a linear relationship between the explanatory variables and the logit of the response variable (James et al. 2013). These assumptions are addressed as follows. The training data set is the same as for XGBoost, and consists of a random selection of 75% of the observations. Choosing the data at random relaxes the assumption of independence. The variance inflation factor (vif) allows quantifying the effects of multicollinearity. In this application all predictor variables have a vif less than 5 (far from the limit of 10). We have checked the linearity between the continuous independent variables and the logit (log odds) with the Box–Tidwell test (1962), as well as the absence of outliers.

In the case of very imbalanced data, LR can seriously underestimate the probability of success (burned pixels) (King and Zeng 2001). The minority group will then be detected with a low sensitivity rate, while the specificity rate remains high. Using an adequate sampling scheme and a weighted procedure, we can compensate for the difference between the numbers of successes and failures. This is done with undersampling and weighting (Haixiang et al. 2017). Weights are assigned to the data to compensate for the differences between successes (burned pixels) and failures (unburned pixels). Undersampling consists in drawing at random, from the failures, a number of observations similar (or equal) to the number of successes; in the original data, successes and failures stand at a ratio of 1:27.88.
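The two corrections can be illustrated with a short sketch. Undersampling is taken here in its usual sense of thinning the majority (unburned) class until it matches the minority class, and the weighting scheme gives each burned pixel a weight equal to the unburned/burned ratio. Both choices, together with the function names and the toy labels, are assumptions for illustration rather than the exact scheme used in the study.

```python
import random


def class_weights(labels):
    """Weight burned pixels (1) by the unburned/burned ratio; unburned get 1."""
    burned = sum(labels)
    unburned = len(labels) - burned
    ratio = unburned / burned
    return [ratio if y == 1 else 1.0 for y in labels]


def undersample(labels, seed=0):
    """Keep all burned pixels and an equal-sized random draw of unburned ones.

    Returns the sorted indices of the retained observations.
    """
    rng = random.Random(seed)
    burned_idx = [i for i, y in enumerate(labels) if y == 1]
    unburned_idx = [i for i, y in enumerate(labels) if y == 0]
    kept = rng.sample(unburned_idx, len(burned_idx))
    return sorted(burned_idx + kept)


# Hypothetical labels: 10 burned and 90 unburned pixels.
toy_labels = [1] * 10 + [0] * 90
weights = class_weights(toy_labels)
balanced_idx = undersample(toy_labels)
```

With these weights the total weight of the burned class equals that of the unburned class, which is the balancing effect the weighted procedure aims for.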

The probability of burned pixels ($Y=1$) is given by

$$\begin{aligned} \pi=P(Y=1\mid X_1,X_2,\ldots ,X_8)=\dfrac{1}{1+\exp \left(-\left(\beta _0+\sum _{k=1}^8 \beta _k X_k\right)\right)} \end{aligned}$$

that can also be expressed as

$$\begin{aligned} \log \left(\dfrac{\pi}{1-\pi}\right)=\beta _0+\beta _1X_1+\beta _2X_2+\cdots +\beta _8 X_8, \end{aligned}$$

where $\{\beta _i \mid i=0,\ldots ,8\}$ are the coefficients given in Table 2, fitted by maximum likelihood through an iteratively reweighted least squares algorithm. All of the predictor variables are statistically significant except for the difference spectral index d.NBR2. We do not exclude it, in order to keep the same auxiliary variables in both methods. Convergence is reached in a few iterations, in less than 1 min.
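The relation between the probability and the log-odds forms can be checked numerically. The sketch below uses the sign convention implied by the log-odds expression, so that applying the logit to the probability returns the linear predictor. The coefficient and predictor values are hypothetical and are not those of Table 2.

```python
import math


def burned_probability(beta0, betas, x):
    """pi = 1 / (1 + exp(-(beta0 + sum_k beta_k * x_k)))."""
    eta = beta0 + sum(b * v for b, v in zip(betas, x))
    return 1.0 / (1.0 + math.exp(-eta))


def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))


# Hypothetical intercept, eight coefficients and one pixel's predictor values.
beta0 = -3.0
betas = [4.0, 0.5, 2.0, -1.0, 1.5, 0.8, -0.2, 0.3]
x = [0.45, 0.10, 0.30, 0.05, 0.20, 0.15, 1.20, 0.60]

pi = burned_probability(beta0, betas, x)
eta = beta0 + sum(b * v for b, v in zip(betas, x))
```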

Table 2 Coefficients, estimated coefficients, standard errors, z-values and p-values obtained for the variables of the training set with LR


3.3 Confusion matrices

The main accuracy metrics used to evaluate the classifiers are shown in Table 3. The true positives (TP) are the pixels both defined and predicted as burned. The false negatives (FN) are the pixels defined as burned but predicted as unburned. The false positives (FP) are those defined as unburned but predicted as burned, and finally the true negatives (TN) are those both defined and predicted as unburned. Therefore, the detection rate (D) is the proportion of pixels both defined and predicted as burned over the whole set of pixels, the omission error (OE) is the proportion of burned reference pixels incorrectly predicted as unburned over the burned reference set, and the commission error (CE) is the proportion of pixels incorrectly predicted as burned over the burned predicted set. The precision (P) is the proportion of correctly predicted burned pixels over the burned predicted set. It is the complement of the commission error $(P=1-CE)$. The recall (R), also called sensitivity or true positive rate, is the proportion of correctly predicted burned pixels over the burned reference set. It is the complement of the omission error ($R=1-OE$). The Dice coefficient (DC) is a similarity measure lying between 0 and 1. It is defined as twice the overlapping area divided by the total number of burned pixels in both images (Guindon and Zhang 2017). The kappa coefficient $(\kappa )$ measures the agreement between the reference and the predicted classification.

Table 3 Definition of accuracy metrics, where TP, TN, FP, and FN are true positives, true negatives, false positives and false negatives, respectively
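The metrics described above can be computed directly from the four cells of a confusion matrix. The sketch below uses the standard formulas that match the verbal definitions (for example, Dice as 2TP/(2TP + FP + FN) and Cohen's kappa from observed and expected agreement); Table 3 remains the authoritative definition. The counts are hypothetical, and the function assumes that both classes are present so that no denominator is zero.

```python
def accuracy_metrics(tp, tn, fp, fn):
    """Accuracy metrics from confusion-matrix counts (or proportions)."""
    n = tp + tn + fp + fn
    observed = (tp + tn) / n
    # Agreement expected by chance, from the marginal totals.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    return {
        "detection": tp / n,
        "omission": fn / (tp + fn),
        "commission": fp / (tp + fp),
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
        "dice": 2 * tp / (2 * tp + fp + fn),
        "kappa": (observed - expected) / (1 - expected),
    }


# Hypothetical counts for 1000 pixels with a small burned class.
metrics = accuracy_metrics(tp=30, tn=960, fp=6, fn=4)
```

Because the burned class is rare, overall agreement is high almost by construction, which is why Dice and kappa are more informative here than the raw proportion of correct pixels.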


4 Results

XGBoost and LR learn and estimate from the same random training data set of 75% of the pixels. Both use corrections to compensate for the very imbalanced data using appropriate weights, but results can inevitably vary due to the randomness of the approximation methods. To assess this variability, we run both procedures 100 times and derive the accuracy metrics. Negligible differences are found among the different runs. Removing the weights for imbalanced data results in an increase in misclassifications. The final predictions of both methods are classified as either 0 or 1, depending on whether the predicted probability is less than or equal to 0.5 or greater than 0.5, respectively.
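The random 75%/25% split and the 0.5 decision rule can be sketched as follows. The function names, the seed handling and the rounding of the training size are assumptions; repeating the split with different seeds mimics the repeated runs, without reproducing the models themselves.

```python
import random


def train_test_split_indices(n, train_fraction=0.75, seed=0):
    """Randomly split the indices 0..n-1 into training and test sets."""
    rng = random.Random(seed)
    idx = list(range(n))
    rng.shuffle(idx)
    cut = int(n * train_fraction)
    return idx[:cut], idx[cut:]


def classify(probabilities, threshold=0.5):
    """Label 1 when the probability exceeds the threshold, 0 otherwise."""
    return [1 if p > threshold else 0 for p in probabilities]


# Hypothetical example: 1000 pixels and three predicted probabilities.
train_idx, test_idx = train_test_split_indices(1000, seed=1)
predicted = classify([0.20, 0.50, 0.51])
```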

Table 4 gives the means of the TP, TN, FP and FN proportions obtained when comparing the predicted classifications of XGBoost and LR with four different classifications: the reference (EFFIS), the re-defined Lclass, the MCD64A1 and the FireCCI5.1 classification. The EFFIS classification is used as reference in the estimation process, Lclass is a refined classification from EFFIS, and MCD64A1 and FireCCI5.1 are classifications based on MODIS products not involved in the estimation process. For the EFFIS reference, the proportions are calculated over the test set (25% of the input dataset) of 126,233 pixels, and for the rest the means are calculated over the total (100%) of 504,933 pixels in 100 runs. As expected, the results obtained for EFFIS and Lclass with XGBoost and LR are slightly better than the ones obtained for MCD64A1 and FireCCI5.1. FN and FP rates are lower for the EFFIS and Lclass classifications in both classifiers, but XGBoost tends to provide more FP and fewer FN than LR in all the scenarios.

Table 4 Proportions of true positives (TP), true negatives (TN), false negatives (FN) and false positives (FP) calculated with the means of 100 runs of XGBoost and LR when compared with the EFFIS reference, the redefined Lclass, the MCD64A1, and the FireCCI5.1 classification


Table 5 shows the means of the accuracy metrics estimated with 100 runs of LR and XGBoost predictions in the four different scenarios already defined. The EFFIS reference has been made with the test set, while the rest of references are obtained with the complete dataset (training and testing). The highest detection rates are provided by XGBoost in the EFFIS (0.033) and Lclass (0.032) classifications, yet LR detection rates are very similar, 0.032 and 0.031 respectively. In all the metrics, the comparison of the predictions with EFFIS and Lclass classifications is more successful than the predictions with the MCD64A1 and FireCCI5.1 products as expected, because these products are not based upon the reference classification. LR is better in Precision and provides a lower number of FP, while XGBoost is better in Recall and provides a lower number of FN in all the scenarios. The similarity between predictions and all the classifications is higher in LR than in XGBoost, as Dice and $\kappa$ coefficients show.

Table 5 Means of estimated accuracy metrics of the 100 runs of logistic and XGBoost predictions vs. the EFFIS reference, the redefined Lclass, the MCD64A1 and the FireCCI5.1 classification


Figures 3 and 4 show the mean of the classifications provided by XGBoost and LR, respectively, vs. the reference, the redefined Lclass, the MCD64A1 and the FireCCI5.1 products in the region of interest. NA pixels, plotted in white, are missing or un-burnable data, and the green pixels are the true unburned pixels (TN), roughly 96% of the pixels. The true burned pixels (TP, in blue) represent approximately 3% of the pixels. All panels show a strong coincidence of location and identification. Only about 1% of the pixels are misclassified, as false negatives (FN, plotted in red) or false positives (FP, plotted in yellow). The FP (see Fig. 3) are mainly in the border regions of burn scars, and are more frequent in the MCD64A1 and FireCCI5.1 classifications. The FN are sparsely distributed (see Fig. 4). Both classifiers perform better when compared with the EFFIS and Lclass classifications than when compared with the MCD64A1 and FireCCI5.1 classifications (Fig. 5).

The importance assessment of XGBoost is shown in Fig. 5, where d.NBR1 has the highest contribution to the predicted classification, followed by d.MVI, distAF, d.NBR2, d.NIR, d.NDVI, d.MIRBI and aF.int. This ranking is expected, because NBR1 is one of the most popular burn indices and MVI is the index used in MODIS BA products (Giglio et al. 2018). The remaining variables have a lower percentage gain, but they are also crucial for detecting burned areas because they improve the classification process. Specifically, including the average density of active fires (aF.int) as a predictor variable reduces both the number of FN (2%) and the number of FP (13%) in the estimated confusion matrices.

Figures 6 and 7 zoom in on the highlighted region of Figs. 3 and 4, respectively. The blue burn scars (TP) are well identified in the four scenarios of both methods, yet identification of burned pixels is higher in the reference and Lclass classifications than in the MCD64A1 and FireCCI5.1 products, as can be expected since the detection rate is 10% higher in those cases (see Table 5). More specifically, Fig. 7 shows that LR has a slightly higher number of FN (plotted in red), but a lower number of FP (plotted in yellow), than XGBoost.

5 Conclusions

In this work, we evaluate a machine learning algorithm called the extreme gradient boosting algorithm (XGBoost), which in many cases outperforms other machine learning algorithms when detecting burned areas using satellite images, and logistic regression (LR), a traditional statistical method that, in principle, is not specifically oriented to detecting burned areas. Both use the same input set of predictor variables, defined with the differences of spectral indices for identifying vegetation changes, the distance of every pixel to the closest active fire, and the average density of active fires by pixel, computed using point processes. Auxiliary variables contribute to a better classification in both methods, in particular the differences of spectral indices, but the distance to the active fires and the average density are also relevant for identifying burned pixels: in LR because these variables are statistically significant, and in XGBoost because removing them increases the number of misclassifications. Using weights to mitigate the bias effect of imbalanced data also aids in better identifying true fires.
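
To make the role of weights under class imbalance concrete, here is a minimal sketch in Python (used for illustration only; the authors' code is in R). This section does not restate the paper's weighting scheme, so the inverse-frequency ("balanced") weights below are an assumption, and the function names `balanced_class_weights` and `weighted_log_loss` and the sample data are hypothetical.

```python
import math
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n / (k * n_class), so that every class
    carries the same total weight (an assumed, common scheme)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * cnt) for cls, cnt in counts.items()}


def weighted_log_loss(labels, probs, weights):
    """Weighted binary cross-entropy, normalised by the total weight."""
    total = 0.0
    weight_sum = 0.0
    for y, p in zip(labels, probs):
        w = weights[y]
        total += -w * (y * math.log(p) + (1 - y) * math.log(1 - p))
        weight_sum += w
    return total / weight_sum


# Hypothetical sample: 3 burned pixels (1) among 100, about 3% burned.
labels = [1] * 3 + [0] * 97
weights = balanced_class_weights(labels)

# A lazy classifier that gives every pixel the base rate of burning.
probs = [0.03] * len(labels)

unweighted = weighted_log_loss(labels, probs, {0: 1.0, 1: 1.0})
weighted = weighted_log_loss(labels, probs, weights)
print(f"unweighted loss: {unweighted:.3f}, weighted loss: {weighted:.3f}")
```

Under these assumed weights, the three burned pixels count as much as the 97 unburned ones, so a classifier that ignores the minority class is penalised far more heavily than under the unweighted loss.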

The conceptualization of XGBoost differs from that of LR, but both present similar results, with pros and cons. XGBoost extracts model-like structure from data without assuming any type of distribution. LR is a well-known parametric method that requires some assumptions to be fitted. On the other hand, LR offers better interpretability and demonstrates greater robustness, as the estimated coefficients remain relatively stable even when the training dataset changes. Moreover, it is highly efficient and significantly faster than XGBoost (from less than 1 min in LR to more than 4 h in XGBoost).

In all classifications, LR has better agreement coefficients (Dice, Overall accuracy and $\kappa$) but a slightly smaller Detection rate. XGBoost has a lower omission error and a higher commission error than LR. In addition, XGBoost exhibits larger differences between omission error and commission error than LR. Specifically, XGBoost achieves a very low number of false negatives (FN) but increases false positives (FP) more than LR does when attempting to reduce FP by increasing FN. Consequently, LR slightly outperforms XGBoost in terms of global accuracy metrics. More importantly, LR emerges as a simple, explainable, computationally efficient, and highly competitive model for classifying large sets of binary data with imbalanced classes. This makes LR an excellent choice for analyzing burned areas using satellite images.
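
For readers less familiar with the terminology, the omission error (OE) and commission error (CE) can be related to Recall and Precision. The definitions below are the usual remote-sensing ones and are assumed here, since this section does not restate them:

$$
\mathrm{OE} = \frac{FN}{TP + FN} = 1 - \mathrm{Recall}, \qquad \mathrm{CE} = \frac{FP}{TP + FP} = 1 - \mathrm{Precision}.
$$

Under these definitions, the higher Recall of XGBoost corresponds to its lower omission error, and the higher Precision of LR corresponds to its lower commission error.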

References

Alnahit AO, Mishra AK, Khan AA (2022) Stream water quality prediction using boosted regression tree and random forest models. Stoch Environ Res Risk Assess 36:2661–2680
Arabameri A, Yamani M, Pradhan B, Melesse A, Shirani K, Bui DT (2019) Novel ensembles of COPRAS multi-criteria decision-making with logistic regression, boosted regression tree, and random forest for spatial prediction of gully erosion susceptibility. Sci Total Environ 688:903–916
Baddeley A, Turner R (2005) Spatstat: an R package for analyzing spatial point patterns. J Stat Softw 12:1–42
Baddeley A, Rubak E, Turner R (2015) Spatial point patterns: methodology and applications with R. Chapman and Hall/CRC Press, London. https://www.routledge.com/Spatial-Point-Patterns-Methodology-and-Applications-with-R/Baddeley-Rubak-Turner/9781482210200/
Bastarrika A, Chuvieco E, Martín MP (2011) Mapping burned areas from Landsat TM/ETM+ data with a two-phase algorithm: balancing omission and commission errors. Remote Sens Environ 115(4):1003–1012
Belgiu M, Drăguţ L (2016) Random forest in remote sensing: a review of applications and future directions. ISPRS J Photogramm Remote Sens 114:24–31
Borrajo I, González-Manteiga W, Martínez-Miranda D (2020) Testing for significant differences between two spatial patterns using covariates. Spat Stat. https://doi.org/10.1016/J.SPASTA.2019.100379
Box GE, Tidwell PW (1962) Transformation of the independent variables. Technometrics 4(4):531–550
Castelli M, Vanneschi L, Popovič A (2015) Predicting burned areas of forest fires: an artificial intelligence approach. Fire Ecol 11(1):106–118
Chen T, Guestrin C (2016) XGBoost: a scalable tree boosting system. In: Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining. pp 785–794. https://doi.org/10.1145/2939672.2939785
Chen X, Vogelmann JE, Rollins M, Ohlen D, Key CH, Yang L et al (2011) Detecting post-fire burn severity and vegetation recovery using multitemporal remote sensing spectral indices and field-collected composite burn index data in a ponderosa pine forest. Int J Remote Sens 32(23):7905–7927
Chuvieco E, Mouillot F, van der Werf GR, San Miguel J, Tanase M, Koutsias N et al (2019) Historical background and current developments for mapping burned area from satellite earth observation. Remote Sens Environ 225:45–64
de Menezes FS, Liska GR, Cirillo MA, Vivanco MJ (2017) Data classification with binary response through the boosting algorithm and logistic regression. Expert Syst Appl 69:62–73
EEDC (2022) Earth engine data catalog. https://modis.gsfc.nasa.gov/about/. Accessed 19 Sept 2022
EFFIS (2021) European forest fire information system (EFFIS). https://effis.jrc.ec.europa.eu/. Accessed 17 Sept 2022
Eidenshink J, Schwind B, Brewer K, Zhu Z-L, Quayle B, Howard S (2007) A project for monitoring trends in burn severity. Fire Ecol 3(1):3–21
Friedman JH (2001) Greedy function approximation: a gradient boosting machine. Ann Stat 29:1189–1232
Friedman J, Hastie T, Tibshirani R (2000) Additive logistic regression: a statistical view of boosting (with discussion and a rejoinder by the authors). Ann Stat 28(2):337–407
García ML, Caselles V (1991) Mapping burns and natural reforestation using thematic mapper data. Geocarto Int 6(1):31–37
Giglio L, Loboda T, Roy DP, Quayle B, Justice CO (2009) An active-fire based burned area mapping algorithm for the MODIS sensor. Remote Sens Environ 113(2):408–420
Giglio L, Boschetti L, Roy DP, Humber ML, Justice CO (2018) The collection 6 MODIS burned area mapping algorithm and product. Remote Sens Environ 217:72–85
Guindon B, Zhang Y (2017) Application of the dice coefficient to accuracy assessment of object-based image classification. Can J Remote Sens 43(1):48–61. https://doi.org/10.1080/07038992.2017.1259557
Haixiang G, Yijing L, Shang J, Mingyun G, Yuanyue H, Bing G (2017) Learning from class-imbalanced data: review of methods and applications. Expert Syst Appl 73:220–239
Hall JV, Argueta F, Giglio L (2021) Validation of MCD64A1 and FireCCI51 cropland burned area mapping in Ukraine. Int J Appl Earth Obs Geoinf 102:102443
Hosmer DW Jr, Lemeshow S, Sturdivant RX (2013) Applied logistic regression, vol 398. Wiley, Hoboken
Ingwersen EW, Stam WT, Meijs BJ, Roor J, Besselink MG, Groot Koerkamp B et al (2023) Machine learning versus logistic regression for the prediction of complications after pancreatoduodenectomy. Surgery. https://www.sciencedirect.com/science/article/pii/S0039606023001587
Jain P, Coogan SC, Subramanian SG, Crowley M, Taylor S, Flannigan MD (2020) A review of machine learning applications in wildfire science and management. Environ Rev 28(4):478–505
James G, Witten D, Hastie T, Tibshirani R (2013) An introduction to statistical learning, vol 112. Springer, New York
Khanmohammadi S, Arashpour M, Golafshani EM, Cruz MG, Yu Bai AR (2022) Prediction of wildfire rate of spread in grasslands using machine learning methods. Environ Model Softw 156:105507. https://doi.org/10.1016/j.envsoft.2022.105507
King G, Zeng L (2001) Logistic regression in rare events data. Polit Anal 9(2):137–163
Koutsias N, Karteris M (2000) Burned area mapping using logistic regression modeling of a single post-fire Landsat-5 thematic mapper image. Int J Remote Sens 21(4):673–687
Libonati R, DaCamara CC, Pereira JMC, Peres LF (2010) Retrieving middle-infrared reflectance for burned area mapping in tropical environments using MODIS. Remote Sens Environ 114(4):831–843. https://doi.org/10.1016/j.rse.2009.11.018
Libonati R, DaCamara CC, Pereira JMC, Peres LF (2011) On a new coordinate system for improved discrimination of vegetation and burned areas using MIR/NIR information. Remote Sens Environ 115(6):1464–1477. https://doi.org/10.1016/j.rse.2011.02.006
Lizundia-Loiola J, Otón G, Ramo R, Chuvieco E (2020) A spatiotemporal active-fire clustering approach for global burned area mapping at 250 m from MODIS data. Remote Sens Environ 236:111493
Loboda T, O'Neal K, Csiszar I (2007) Regionally adaptable dNBR-based algorithm for burned area mapping from MODIS data. Remote Sens Environ 109(4):429–442
Lutes D, Keane RE, Caratti J, Key C, Benson N, Sutherland S, Gangi LJ (2006) Landscape assessment: ground measure of severity, the composite burn index; and remote sensing of severity, the normalized burn ratio. FIREMON: Fire Effects Monitoring and Inventory System. USDA Forest Service, Rocky Mountain Research Station, Ogden, UT 1:1–51
Martinho VJPD (2018) Forest fires across Portuguese municipalities: zones of similar incidence, interactions and benchmarks. Environ Ecol Stat 25:405–428
Mas JF, Flores JJ (2008) The application of artificial neural networks to the analysis of remotely sensed data. Int J Remote Sens 29(3):617–663
McCarley TR, Smith AM, Kolden CA, Kreitler J (2018) Evaluating the Mid-Infrared Bi-spectral index for improved assessment of low-severity fire effects in a conifer forest. Int J Wildland Fire 27(6):407–412
Militino AF, Ugarte M, Montesino M (2019a) Filling missing data and smoothing altered data in satellite imagery with a spatial functional procedure. Stoch Environ Res Risk Assess 33(10):1737–1750
Militino AF, Ugarte MD, Pérez-Goya U, Genton MG (2019b) Interpolation of the mean anomalies for cloud filling in land surface temperature and normalized difference vegetation index. IEEE Trans Geosci Remote Sens 57(8):6068–6078. https://doi.org/10.1109/TGRS.2019.2904193
Miller JD, Knapp EE, Key CH, Skinner CN, Isbell CJ, Creasy RM, Sherlock JW (2009) Calibration and validation of the relative differenced Normalized Burn Ratio (RdNBR) to three measures of fire severity in the Sierra Nevada and Klamath mountains, California, USA. Remote Sens Environ 113(3):645–656
Mohler RL, Goodin DG (2010) A comparison of red, NIR, and NDVI for monitoring temporal burn signature change in tallgrass prairie. Remote Sens Lett 1(1):3–9
NASA (2020) Fire information for resource management system. https://firms.modaps.eosdis.nasa.gov/map/. Accessed 25 Aug 2021
Pereira JM (1999) A comparative evaluation of NOAA/AVHRR vegetation indexes for burned surface detection and mapping. IEEE Trans Geosci Remote Sens 37(1):217–226
Pérez-Goya U, Montesino-SanMartin M, Militino AF, Ugarte MD (2021) rsat: dealing with multiplatform satellite images from Landsat, MODIS, and Sentinel. R package version 0.1.16. https://github.com/ropensci/rsat
Petropoulos GP, Kontoes C, Keramitsoglou I (2011) Burnt area delineation from a uni-temporal perspective based on Landsat tm imagery classification using support vector machines. Int J Appl Earth Obs Geoinf 13(1):70–80
Pinto MM, Libonati R, Trigo RM, Trigo IF, DaCamara CC (2020) A deep learning approach for mapping and dating burned areas using temporal sequences of satellite images. ISPRS J Photogramm Remote Sens 160:260–274
R Core Team (2023) R: a language and environment for statistical computing. Vienna, Austria. https://www.R-project.org/
Ramampiandra EC, Scheidegger A, Wydler J, Schuwirth N (2023) A comparison of machine learning and statistical species distribution models: quantifying overfitting supports model interpretation. Ecol Model 481:110353
Ramo R, Chuvieco E (2017) Developing a random forest algorithm for MODIS global burned area classification. Remote Sens 9(11):1193
Rizeei HM, Pradhan B, Saharkhiz MA, Lee S (2019) Groundwater aquifer potential modeling using an ensemble multi-adoptive boosting logistic regression technique. J Hydrol 579:124172
Sahin EK (2022) Comparative analysis of gradient boosting algorithms for landslide susceptibility mapping. Geocarto Int 37(9):2441–2465. https://doi.org/10.1080/10106049.2020.1831623
San-Miguel-Ayanz J, Oom D, Artes T, Viegas D, Fernandes P, Faivre N et al (2020) Forest fires in Portugal in 2017. Publications Office of the European Union
Santana NC, de Carvalho Júnior OA, Gomes RAT, Guimarães RF (2018) Burned-area detection in Amazonian environments using standardized time series per pixel in MODIS data. Remote Sens 10(12):1904
Tomshin O, Solovyev V (2021) Spatio-temporal patterns of wildfires in Siberia during 2001–2020. Geocarto Int. https://doi.org/10.1080/10106049.2021.1973581
Trigg S, Flasse S (2001) An evaluation of different bi-spectral spaces for discriminating burned shrub-savannah. Int J Remote Sens 22(13):2641–2647
Tucker CJ (1979) Red and photographic infrared linear combinations for monitoring vegetation. Remote Sens Environ 8(2):127–150
Turco M, Herrera S, Tourigny E, Chuvieco E, Provenzale A (2019a) A comparison of remotely-sensed and inventory datasets for burned area in Mediterranean Europe. Int J Appl Earth Obs Geoinf 82:101887
Turco M, Jerez S, Augusto SEA (2019b) Climate drivers of the 2017 devastating fires in Portugal. Sci Rep 9:13886. https://doi.org/10.1038/s41598-019-50281-2
Turner R (2009) Point patterns of forest fire locations. Environ Ecol Stat 16:197–223
Van Wagtendonk JW, Root RR, Key CH (2004) Comparison of AVIRIS and Landsat ETM+ detection capabilities for burn severity. Remote Sens Environ 92(3):397–408
Vetrita Y, Cochrane MA, Priyatna M, Sukowati KA, Khomarudin MR et al (2021) Evaluating accuracy of four MODIS-derived burned area products for tropical peatland and non-peatland fires. Environ Res Lett 16(3):035015
VIIRS (2021) Visible infrared imaging radiometer suite. https://www.star.nesdis.noaa.gov/jpss/VIIRS.php. Accessed 9 Sept 2022
Wang Q, Wang L, Zhu X, Ge Y, Tong X, Atkinson PM (2022) Remote sensing image gap filling based on spatial-spectral random forests. Sci Remote Sens 5:100048
Zhang R, Qu JJ, Liu Y, Hao X, Huang C, Zhan X (2015) Detection of burned areas from mega-fires using daily and historical MODIS surface reflectance. Int J Remote Sens 36(4):1167–1187

Funding

Open Access funding provided by Universidad Pública de Navarra. This work has been funded by the project PID 2020-113125RB-I00 of the Spanish Research Agency (MCIN/ AEI/10.13039/501100011033) and Ayudas predoctorales UPNA 2022-2023.

Author information

A. F. Militino, H. Goyena, U. Pérez-Goya, and M. D. Ugarte have contributed equally to this work.

Authors and Affiliations

Department of Statistics, Computer Science and Mathematics, Public University of Navarre, Campus de Arrosadia, 31006, Pamplona, Spain
A. F. Militino, H. Goyena, U. Pérez-Goya & M. D. Ugarte
Institute for Advanced Materials and Mathematics (InaMat2), Public University of Navarra, Campus de Arrosadia, 31006, Pamplona, Navarre, Spain
A. F. Militino, H. Goyena, U. Pérez-Goya & M. D. Ugarte

Contributions

AFM, and MDU: methodology, writing the main manuscript. HG and AFM: data curation, code, figures. UP-G: data curation, code revision. All authors reviewed the manuscript.

Corresponding author

Correspondence to M. D. Ugarte.

Ethics declarations

Conflict of interest

The authors declare no conflict of interest.

Additional information

Handling Editor: Daniela Cocchi.

Appendices

Appendix 1: code availability section

Name of the code/library: LXG

Contact: e-mail and phone number: harkaitz.goyena@unavarra.es (+34)948168965

Hardware requirements: A Windows PC with an Intel(R) Core(TM) i7-6700 @3.40GHz processor and 16 GB

Program language: R 4.2.2

Software required: R (https://www.R-project.org/)

Program size: 50 KB. The full repository: 360 MB

Source codes: https://github.com/spatialstatisticsupna/LXG

License: GNU General Public License, version 3 (SPDX short identifier: GPL-3.0)

Appendix 2: data availability statement

The data that support the findings of this study are available in LXG at https://github.com/spatialstatisticsupna/LXG.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Militino, A.F., Goyena, H., Pérez-Goya, U. et al. Logistic regression versus XGBoost for detecting burned areas using satellite images. Environ Ecol Stat 31, 57–77 (2024). https://doi.org/10.1007/s10651-023-00590-7

Received: 15 June 2023
Accepted: 21 November 2023
Published: 20 January 2024
Issue Date: March 2024
DOI: https://doi.org/10.1007/s10651-023-00590-7
