Environmental and Ecological Statistics, Volume 25, Issue 3, pp 325–340

Estimating wildfire growth from noisy and incomplete incident data using a state space model

  • Harry Podschwit
  • Peter Guttorp
  • Narasimhan Larkin
  • E. Ashley Steel
Open Access
Article

Abstract

Wildfire behaviors are complex and are of interest to fire managers and scientists for a variety of reasons. Many of these important behaviors are directly measured from the cumulative burn area time series of individual wildfires; however, estimating cumulative burn area time series is challenging due to the magnitude of measurement errors and missing entries. To resolve this, we introduce two state space models for reconstructing wildfire burn area using repeated observations from multiple data sources that include different levels of measurement error and temporal gaps. The constant growth parameter model uses a few parameters and assumes a burn area time series that follows a logistic growth curve. The non-constant growth parameter model uses a time-varying logistic growth curve to produce detailed estimates of the burn area time series that permit sudden pauses and pulses of growth. We apply both reconstruction models to burn area data from 13 large wildfire incidents to compare the quality of the burn area time series reconstructions and computational requirements. The constant growth parameter model reconstructs burn area time series with minimal computational requirements, but inadequately fits observed data in most cases. The non-constant growth parameter model better describes burn area time series, but can also be highly computationally demanding. Sensitivity analyses suggest that in a typical application, the reconstructed cumulative burn area time series is fairly robust to minor changes in the prior distributions.

Keywords

Data reconciliation, Gibbs sampling, Isotonic regression, Logistic difference equation, Missing data, State space model, Wildfire growth

1 Introduction

Understanding and quantifying wildfire behaviors is of interest to the fire management and research communities for numerous reasons. Fire managers are often asked to predict how large an incident will grow, how long it will continue, or when rapid growth may threaten public safety or hamper firefighter effectiveness. Fire researchers analyze how these behaviors are influenced by the environment, monitor long-term impacts, and assess existing scientific theories. Regardless of the audience, an often-valuable piece of information is a complete and accurate record of wildfire growth. Fire managers use this information as a training aid, decision-making guide, and prediction tool (Alexander and Thomas 2003) and it is also used by the research community for a variety of purposes including the validation and improvement of spread models (Andrews et al. 2007; Alexander and Thomas 2003; Taylor et al. 2013), estimation of wildfire emissions (Veraverbeke et al. 2015; Turquety et al. 2007; Mangeon et al. 2015; Lavoué et al. 2008) and associated health effects (Moeltner et al. 2013), relating meteorology to extreme growth (Billmire et al. 2014), and identifying correlated wildfire behaviors (Birch et al. 2014). Important quantities such as final burn area (Turetsky et al. 2004), area burned per day (Billmire et al. 2014; Birch et al. 2014; Turner et al. 1994) and the classification of high and low growth days (Finney et al. 2009) are easily calculated from this kind of information. Records of growth throughout an incident’s lifetime are relatively rare in comparison to other kinds of information such as size, occurrence and duration, but are sometimes available from a combination of written and spatially explicit sources.

Written growth records may come from case studies, historical accounts, administrative records (Taylor et al. 2013), newspaper reports (Johnston et al. 2006), and social media services (De Longueville et al. 2009). In the United States, incident status summary (ICS 209) reports detail information regarding natural disasters occurring on federally owned lands and are a rich source of written growth records for wildfire incidents. ICS 209 reports provide near-daily updates to agency managers who make decisions at broad geographic scales regarding the planning, allocation, and prioritization of resources. Each report contains 47 blocks of information including incident name, date and time, coordinates, management strategy, fuel types, values-at-risk, firefighting resources, estimated costs, containment dates, and incident size. While it is sometimes possible to construct a complete daily wildfire growth record from ICS 209 reports, these reports are not compiled for every incident and may be based on rough estimates of burn area.1

Information similar to that contained in written growth records can often be derived from other spatially explicit growth records, which have become increasingly available with technological improvements in burn area mapping. Prior to these advances, spatially explicit growth records had to come from hand-drawn maps based on ground-level observations (Petersen 2014; Callister et al. 2016), a highly subjective and unreliable process. The use of aerial surveys and later Global Positioning Systems alleviated these limitations somewhat by allowing mappers to map burn perimeters quickly and sometimes more accurately (Kasischke et al. 2002; Conese and Bonora 2005). Infrared imaging systems that detect wildfires through darkness and smoke (Hirsch 1965) dramatically improved the accuracy of aerial imagery and are incorporated into a variety of sophisticated technologies and products widely used in modern wildfire perimeter mapping (Allison et al. 2016). Satellite instrumentation was first used to monitor wildfire activity in the 1970s (Taylor et al. 2013) and many other space-based instruments have been launched since then, taking regular samples of burn area and active fire detections globally (Joyce et al. 2009). Satellite coverage is currently sufficient to monitor daily fire activity across North America (McNamara et al. 2004), providing a useful spatially explicit growth record when alternative sources are incomplete (Veraverbeke et al. 2014; Parks 2014) or nonexistent (Zhang et al. 2011).

Observation errors, missing data, and disagreement among sources suggest that no single data source can be trusted as authoritative and we can often improve data quality by integrating information across multiple data sources into a unified, but still uncertain estimate of the growth curve (Magnani and Montesi 2010). We explore the problem of reconstructing growth curves from incomplete and corrupted data in Sect. 2, where we present two state space models for reconciling data from multiple data sources into a coherent growth curve. In Sect. 3, we apply the models to wildfire growth data to assess the relative fit and computational requirements of both models. Section 4 presents the results of a sensitivity analysis that compares the effects of substituting priors. Section 5 closes the paper with a discussion of the potential applications of both models as data reconciliation tools and suggests future research directions.

2 Model assumptions

We address the problem of reconstructing growth curves from incomplete and corrupted data through the use of state space models, which have separate components to describe the underlying process and the observations. Specifically, we assume the actual burn area over time, \(\mathbf{x}=\{X_{0},X_{1},\ldots ,X_{T-1}\}\), is the underlying process and follows deterministically from a growth model that accepts parameters in the vector \(\theta\), and we assume the observations from data source \(i\in \{1,2,\ldots ,N\}\), denoted \(\mathbf{y}_{i}=\{Y_{i0},Y_{i1},\ldots ,Y_{i,T-1}\}\), are noisy realizations of the underlying process and are related to it via probability distributions that accept parameters in the vector \(\gamma_{i}\) (Godsill et al. 2004).

Given known aspects of wildfire behavior and growth records, we assume the growth model to be discrete-time, non-decreasing, non-negative, and sigmoidal. The standard Beverton–Holt difference equation, described by Eq. (1), follows these constraints and we assume it to be a reasonable model of the process.
$$\begin{aligned} X_{t}=\frac{r_{t-1}X_{t-1}}{1+(r_{t-1}-1)\,X_{t-1}}. \end{aligned}$$
(1)
The resulting growth curve is controlled by the inherent growth parameters, \(r_{t}\) and the initial condition, \(X_{0}\) (Beverton and Holt 1957; De La Sen 2008), and if \(0<X_{0}< 1\) and \(r_{t}\ge 1\), then the process will be discrete, begin at size \(X_{0}\), and approach size one according to a sigmoidal function.
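
The recursion in Eq. (1) can be illustrated with a short numerical sketch. The code below is not part of the original analysis (which was carried out in JAGS); the function name `beverton_holt` and the input values (30 time steps, an initial fraction of 0.01, and a constant growth parameter of 1.5) are hypothetical choices made only for illustration.

```python
def beverton_holt(x0, growth_params):
    """Iterate Eq. (1) once per entry of growth_params and return the full path."""
    if not 0.0 < x0 < 1.0:
        raise ValueError("x0 must lie strictly between 0 and 1")
    path = [x0]
    for r in growth_params:
        if r < 1.0:
            raise ValueError("each growth parameter must be at least 1")
        prev = path[-1]
        # Eq. (1): X_t = r X_{t-1} / (1 + (r - 1) X_{t-1})
        path.append(r * prev / (1.0 + (r - 1.0) * prev))
    return path


# Hypothetical inputs: T = 30 time steps, 1% initial fraction, constant r = 1.5.
T = 30
path = beverton_holt(0.01, [1.5] * (T - 1))
increments = [later - earlier for earlier, later in zip(path, path[1:])]

print(f"final fraction burned: {path[-1]:.4f}")
print(f"largest increment at step {increments.index(max(increments)) + 1}")
```

Under these assumed inputs the path never decreases and stays below one, which is the behavior the constraints \(0<X_{0}<1\) and \(r_{t}\ge 1\) are meant to guarantee.
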
We present two versions of the standard Beverton–Holt difference equation: the constant growth parameter (CGP) model and the non-constant growth parameter (NGP) model. The CGP model uses only one inherent growth parameter to describe the distribution of growth over the wildfires’ lifetime, assuming an underlying process that is a discrete-time analog of the logistic equation (Berezansky and Braverman 2004). The NGP model has a time-varying inherent growth parameter, producing curves that better describe processes which deviate from the simple sigmoid growth curve. The NGP model allows time-varying growth by letting the inherent growth parameter follow the difference equation described in Eq. (2),
$$\begin{aligned} r_{t}=1+(r-1)\omega _{t}. \end{aligned}$$
(2)
Here the shifted inherent growth parameter is multiplied by lognormal noise, \(\omega _{t}\), that has geometric mean 1 and geometric standard deviation \(\tau \). Note that for a process of length T, the last value of the inherent growth parameter, \(r_{T-1}\), is arbitrary and we set \(X_{T-1}=1\). Also note that the number of parameters in \({\theta }\) depends on the growth model under consideration, with the CGP model accepting \(\theta _{CGP}=(X_{0},r)\) and the NGP model accepting \(\theta _{NGP}=(X_{0},r_{0:T-1})\).
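
A minimal simulation sketch of the NGP process in Eqs. (1) and (2) follows. It assumes that \(\tau \) acts as the standard deviation of \(\ln \omega _{t}\), which is one reading of the text, and it does not impose the end condition \(X_{T-1}=1\). The function names and the values \(r=1.5\), \(\tau =0.5\), \(X_{0}=0.01\) and \(T=30\) are hypothetical.

```python
import random


def ngp_growth_parameters(r, tau, steps, seed=0):
    """Draw r_t = 1 + (r - 1) * omega_t with ln(omega_t) ~ Normal(0, tau).

    Assumption: tau is treated as the log-scale standard deviation of omega_t.
    """
    rng = random.Random(seed)
    return [1.0 + (r - 1.0) * rng.lognormvariate(0.0, tau) for _ in range(steps)]


def ngp_path(x0, growth_params):
    """Apply Eq. (1) with a different growth parameter at every time step."""
    path = [x0]
    for r_t in growth_params:
        prev = path[-1]
        path.append(r_t * prev / (1.0 + (r_t - 1.0) * prev))
    return path


# Hypothetical inputs for illustration only.
T = 30
rs = ngp_growth_parameters(r=1.5, tau=0.5, steps=T - 1, seed=1)
path = ngp_path(0.01, rs)
increments = [later - earlier for earlier, later in zip(path, path[1:])]

print("smallest and largest r_t:", round(min(rs), 3), round(max(rs), 3))
print("largest daily increment:", round(max(increments), 4))
```

Because \(\omega _{t}>0\), every simulated \(r_{t}\) exceeds one, so the path remains non-decreasing while the size of each daily increment varies from step to step.
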
The observation equations can relate the actual burn area to our data through deterministic or stochastic means, with rescaling procedures being an example of the former and regression models the latter. We assume a stochastic relationship between the process and observations, with multiplicative rather than additive observation errors due to the presumed nonnegativity of burn area data and the use of proportional errors in existing validation studies (Kolden et al. 2012; Sparks et al. 2015). We also assume that observation errors are larger during periods of high growth than during periods of low growth and weight the observations according to Eq. (3).
$$\begin{aligned} Y_{it}\sim \mathrm {Lognormal}\left( \ln (K_{i}X_{t}),\ \sigma \sqrt{X_{t}/X_{t-1}}\right) . \end{aligned}$$
(3)
Here \(K_{i}\) is the final burn area parameter for data source i and rescales the process. The observation error parameter, \(\sigma \), represents the geometric standard deviation of observations generated from a wildfire that is not growing. The observation equations have the same structure in the CGP and NGP models with a final burn area parameter specific to each data source and a common observation error parameter.
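
The observation model in Eq. (3) can be sketched as a simulation. The code below assumes that the second lognormal argument is a standard deviation on the log scale and, because no \(X_{-1}\) exists, that the growth ratio at \(t=0\) equals one. The process values, \(K=5000\) and \(\sigma =0.1\) are hypothetical.

```python
import math
import random


def simulate_observations(x, k, sigma, seed=0):
    """Draw one noisy observation per time step following Eq. (3)."""
    rng = random.Random(seed)
    observations = []
    for t, x_t in enumerate(x):
        # Assumption: at t = 0 there is no previous state, so the ratio is set to 1.
        ratio = x_t / x[t - 1] if t > 0 else 1.0
        sdlog = sigma * math.sqrt(ratio)  # larger error when growth is rapid
        observations.append(rng.lognormvariate(math.log(k * x_t), sdlog))
    return observations


# Hypothetical sigmoidal process on (0, 1) with 20 time steps.
x = [1.0 / (1.0 + 1.5 ** (10 - t)) for t in range(20)]
y = simulate_observations(x, k=5000.0, sigma=0.1, seed=42)

for t in (0, 10, 19):
    print(t, round(5000.0 * x[t], 1), round(y[t], 1))
```

Each data source would use its own \(K_{i}\) with the shared \(\sigma \); missing days can be represented by simply dropping the corresponding entries.
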
The parameter values are not known with certainty; we therefore describe them with probability distributions, or priors, that represent our beliefs regarding their values.
$$\begin{aligned}&X_{0}\sim \mathrm {U}((1+r^{1/2})^{-1}, (1+r^{T-3/2})^{-1}). \end{aligned}$$
(4)
$$\begin{aligned}&m\sim \mathrm {Beta}(2.37,1.65). \end{aligned}$$
(5)
$$\begin{aligned}&\sigma \sim \mathrm {U}(3.219\times 10^{-7},1.175). \end{aligned}$$
(6)
$$\begin{aligned}&\mathrm {ln}(K/404)\sim \mathrm {Gen.Gamma}(0.396,0.298,2.357). \end{aligned}$$
(7)
$$\begin{aligned}&\tau \sim \mathrm {Lognormal}(-6.827,5.214). \end{aligned}$$
(8)
The distribution shown in Eq. (4) constrains the possible growth curves so that the largest daily size increment occurs during the reconstructed burn area time series. The CGP model forces the time step of the largest daily size increment, \(\arg \max _{t}(X_{t}-X_{t-1})\), to an integer, \(t_{max}\ge 1\), by initializing at \(X_{0}=(1+r^{(t_{max}-1/2)})^{-1}\). Assuming that the largest daily size increment occurs within the observation window implies that \(1\le t_{max}\le T-1\), or equivalently, that \(X_{0}\) lies between \((1+r^{T-3/2})^{-1}\) and \((1+r^{1/2})^{-1}\), which form the limits of a uniform distribution.
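
The link between the initial condition and the timing of peak growth can be made explicit with a short derivation. The following is a sketch under the CGP assumption of a constant \(r>1\); the auxiliary quantity \(u_{t}\) is notation introduced here for illustration only.

```latex
\begin{aligned}
u_{t} &:= \frac{1-X_{t}}{X_{t}}
  \;\Longrightarrow\; u_{t}=\frac{u_{t-1}}{r}
  \;\Longrightarrow\; X_{t}=\frac{1}{1+u_{0}\,r^{-t}},\\
X_{0} &= \left(1+r^{\,t_{max}-1/2}\right)^{-1}
  \;\Longrightarrow\; u_{0}=r^{\,t_{max}-1/2}
  \;\Longrightarrow\; X_{t}=\frac{1}{1+r^{\,t_{max}-1/2-t}}.
\end{aligned}
```

The resulting curve is logistic in \(t\) with midpoint \(t_{max}-1/2\), so the increment \(X_{t}-X_{t-1}\) is largest over the step that straddles the midpoint symmetrically, namely from \(t_{max}-1\) to \(t_{max}\). Because \(r>1\), the choice \(t_{max}=1\) gives the largest admissible \(X_{0}\) and \(t_{max}=T-1\) the smallest.
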

The inherent growth parameter prior is shown in Eq. (5) and is elicited by fitting to independent data of peak growth, \(m=\max _{t\in \left\{ 0,1,\ldots ,T-2\right\} }(X_{t+1}-X_{t})/K\), which under the CGP model, is related to the inherent growth parameter via \(r=(m+1)^{2}/(m-1)^{2}\). Peak growth estimates come from 2013 ICS 209 records2 of large (\(>404\) ha) wildfires occurring within the continental United States during the years 2002–2013. The beta distribution is fit via maximum-likelihood and adequately describes the peak growth data as confirmed upon visual inspection of the QQ-plot.3
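
The relation between peak growth and the inherent growth parameter can be checked numerically. In the sketch below, the inverse mapping \(m=(\sqrt{r}-1)/(\sqrt{r}+1)\) is derived here by solving the stated relation for \(m\) and is not quoted from the text; the values \(m=0.3\), \(t_{max}=5\) and \(T=12\) are hypothetical.

```python
def growth_from_peak(m):
    """r = (m + 1)^2 / (m - 1)^2 for a peak growth fraction 0 < m < 1."""
    return ((m + 1.0) / (m - 1.0)) ** 2


def peak_from_growth(r):
    """Inverse mapping (derived for this sketch): m = (sqrt(r) - 1) / (sqrt(r) + 1)."""
    root = r ** 0.5
    return (root - 1.0) / (root + 1.0)


# Hypothetical values for illustration.
m = 0.3
t_max, T = 5, 12
r = growth_from_peak(m)

# CGP initial condition, then T - 1 applications of the Beverton-Holt recursion.
x = [(1.0 + r ** (t_max - 0.5)) ** -1]
for _ in range(T - 1):
    x.append(r * x[-1] / (1.0 + (r - 1.0) * x[-1]))

increments = [later - earlier for earlier, later in zip(x, x[1:])]
peak = max(increments)
peak_step = increments.index(peak) + 1  # increment i spans steps i -> i + 1

print("r:", round(r, 4))
print("simulated peak growth:", round(peak, 6), "at step", peak_step)
```

In this example the simulated peak increment matches the assumed \(m\) and occurs at the chosen \(t_{max}\), which is the behavior the CGP initialization is designed to produce.
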

The observation error parameter prior shown in Eq. (6) places finite bounds on the range of possible errors using a uniform distribution. The upper bound represents a scenario where a multiplicative error of a factor of 10 or greater occurs in 1 in 20 observations. For context, the largest overestimate of burn area in a survey of eight relevant ICS 209 reports was by a factor of 1.8, with values less than 1.05 being more common. The lower bound represents a scenario where a multiplicative error of 1-1/1585000 or greater occurs in 1 in 20 observations, which suggests the observations of a Yellowstone-sized wildfire are largely accurate to within an acre.
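
The bounds in Eq. (6) can be approximately recovered with a few lines of arithmetic. The sketch below assumes that \(\sigma \) is a log-scale standard deviation and that the 1-in-20 exceedance is two-sided, that is, it counts errors of the stated factor in either direction; both are interpretations rather than statements from the text, and the helper name `sigma_for_exceedance` is hypothetical.

```python
import math
from statistics import NormalDist


def sigma_for_exceedance(factor, prob):
    """Log-scale sd such that P(|ln error| >= ln factor) = prob (two-sided)."""
    z = NormalDist().inv_cdf(1.0 - prob / 2.0)
    return math.log(factor) / z


# Upper bound: a factor-of-10 error in 1 of 20 observations.
upper = sigma_for_exceedance(10.0, 1 / 20)

# Lower bound: an error of one part in 1,585,000 in 1 of 20 observations.
lower = sigma_for_exceedance(1.0 + 1.0 / 1_585_000, 1 / 20)

print(f"upper bound: {upper:.3f}")
print(f"lower bound: {lower:.3e}")
```

Under these assumptions the two values agree with the endpoints of the uniform prior in Eq. (6) to the reported precision.
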

Equation (7) describes the final burn area parameter prior, which is a generalized gamma distribution4 fit to transformed burn area (BA) data from the same ICS 209 records as the peak growth prior. The generalized gamma distribution is fit via maximum-likelihood and adequately describes the distribution of the transformed burn area data as confirmed via visual inspection of the QQ-plot. In some cases it is natural to use an unscaled final burn area parameter prior, which equals one with probability one, to represent a scenario in which the observations are simply noisy realizations of the underlying process.

In the NGP model, we require a so-called noise prior to describe the process noise, \(\tau \), which we construct using two extreme scenarios to bound the distribution’s central 90th percentile. To set the lower bound on the noise prior, we consider near identical levels of peak growth, \(m_{1}\) and \(m_{2}\), convert them to \(r_{1}-1\) and \(r_{2}-1\), and find the multiplicative difference. The 5th percentile of the noise prior then corresponds to a scenario in which an incremental difference in the inherent growth parameter of that scale or greater occurs with probability 0.002, or about once every 500 time steps. We calculate the upper percentile using a similar process, but with very dissimilar levels of peak growth, representing the scenario where process variability is extremely high. Specifically, the 95th percentile of the noise prior corresponds to a scenario in which an incremental difference in the inherent growth parameter of that scale or greater occurs with probability 0.2, or about once every 5 time steps. The quantities \(m_{1}=1/1{,}585{,}000\) and \(m_{2}=1/1{,}585{,}001\) are used to calculate the lower bound and \(m_{1}=1/1000\) and \(m_{2}=1/1{,}585{,}001\) for the upper bound, resulting in the final noise prior described in Eq. (8).
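
The elicitation of the noise prior can be traced step by step. The sketch below is one reading of the procedure: it treats \(\tau \) as the log-scale standard deviation of \(\omega _{t}\), treats the exceedance probabilities as two-sided, and uses the identity \(r-1=4m/(1-m)^{2}\), which follows from \(r=(m+1)^{2}/(m-1)^{2}\). The helper names are hypothetical.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def shifted_growth(m):
    """r - 1 under the CGP relation, written as 4m / (1 - m)^2 to avoid cancellation."""
    return 4.0 * m / (1.0 - m) ** 2


def tau_for_scenario(m1, m2, prob):
    """Log-scale sd at which a ratio (r1 - 1)/(r2 - 1) is exceeded with probability prob."""
    log_ratio = abs(math.log(shifted_growth(m1) / shifted_growth(m2)))
    z = STD_NORMAL.inv_cdf(1.0 - prob / 2.0)  # assumption: two-sided exceedance
    return log_ratio / z


# 5th and 95th percentiles of tau from the two extreme scenarios in the text.
tau_05 = tau_for_scenario(1 / 1_585_000, 1 / 1_585_001, 0.002)
tau_95 = tau_for_scenario(1 / 1_000, 1 / 1_585_001, 0.2)

# Lognormal parameters whose central 90% interval spans (tau_05, tau_95).
z90 = STD_NORMAL.inv_cdf(0.95)
meanlog = (math.log(tau_05) + math.log(tau_95)) / 2.0
sdlog = (math.log(tau_95) - math.log(tau_05)) / (2.0 * z90)

print(f"meanlog = {meanlog:.3f}, sdlog = {sdlog:.3f}")
```

Under these assumptions the computed parameters land close to those of the lognormal prior in Eq. (8).
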

3 Application

3.1 Data and computational methods

To illustrate the application of the state space models, we reconstruct wildfire growth curves from 13 incidents from the 2014 wildfire season (Table 1) using \(N=2\) data sources: burn area estimates from GeoMAC wildfire perimeters5 and cumulative hotspot detects from the Hazard Mapping System (HMS).6

For each incident k, there are two observation vectors of length \(T_{k}\), where \(T_{k}\) is the number of days between the incident’s first and last perimeter plus a six day buffer period to capture information outside the lifetime of GeoMAC measurements. One observation vector is populated with burn area estimates extracted from the “area” feature of GeoMAC perimeters, retaining only the largest perimeter when two exist on the same day. The other observation vector is populated with the percentage of the total HMS hotspot detects occurring within the incident boundary, where we define the incident boundary to be the largest perimeter with an 8-kilometer buffer. Note that the priors in the previous section are fit to data that are independent of those used in this application.

Both the CGP and NGP models are fit using JAGS software with the runjags package in R (R Development Core Team 2008) on a MacBook Pro with a 2.7 GHz Intel Core i7 processor. We first run an initial Markov chain Monte Carlo (MCMC) simulation using three parallel chains with a nominal sample size of 1000, thinning interval of 100, burn-in period of 10,000 and adaptive phase of 10,000. Convergence is monitored visually and by calculating the potential scale reduction factor of the range of the central 90% of the marginal posteriors (Brooks and Gelman 1998). If necessary, the simulations are continued in batches of 1000 iterations until the maximum potential scale reduction factor falls below 1.01 (Gelman and Shirley 2011). We assess the relative fit of the reconstruction models by using the expected log Bayes factor, \(E[2\ln (\mathrm {Pr}(\mathbf {x},\theta _{NGP},\gamma _{1:2}|\mathbf {y}_{1:2})/\mathrm {Pr}(\mathbf {x},\theta _{CGP},\gamma _{1:2}|\mathbf {y}_{1:2}))]\) (Kass and Raftery 1995), and assess model quality using the root mean square error (RMSE), mean absolute error (MAE), mean absolute percent error (MAPE), and mean bias error (MBE) between the GeoMAC observations and the median reconstructed growth curve (Cruz and Alexander 2013).
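
The four goodness-of-fit summaries can be computed as in the sketch below. The sign convention (error = reconstructed minus observed), the use of `None` for days without an observation, and the example numbers are assumptions made for illustration; the text does not state these details. MAPE is undefined when an observed value is zero, which this sketch does not handle.

```python
import math


def fit_metrics(observed, predicted):
    """RMSE, MAE, MAPE (%) and MBE over the time steps that have an observation."""
    pairs = [(o, p) for o, p in zip(observed, predicted) if o is not None]
    if not pairs:
        raise ValueError("no observed values to compare against")
    errors = [p - o for o, p in pairs]  # assumption: reconstructed minus observed
    n = len(errors)
    return {
        "RMSE": math.sqrt(sum(e * e for e in errors) / n),
        "MAE": sum(abs(e) for e in errors) / n,
        "MAPE": 100.0 * sum(abs(p - o) / abs(o) for o, p in pairs) / n,
        "MBE": sum(errors) / n,
    }


# Hypothetical observed series (with a gap) and median reconstruction.
observed = [100.0, None, 400.0, 900.0]
predicted = [110.0, 250.0, 380.0, 900.0]
metrics = fit_metrics(observed, predicted)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

A positive MBE under this convention indicates a reconstruction that sits above the observations on average, while RMSE and MAE are in the units of the burn area data.
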
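Table 1

Computational requirements of both NGP and CGP models

| Firename | Iterations until convergence (\(\times 1000\)), CGP | Iterations until convergence (\(\times 1000\)), NGP | Iterations per second, CGP | Iterations per second, NGP | Total time (s), CGP | Total time (s), NGP |
|---|---|---|---|---|---|---|
| (a) Beaver | 3 | 9 | 61.1 | 2.0 | 49 | 4517 |
| (b) Big Cougar | 1 | 77 | 111.3 | 8.8 | 9 | 8715 |
| (c) Buzzard | 3 | 8 | 129.2 | 10.2 | 23 | 785 |
| (d) Carlton | 3 | 18 | 70.4 | 2.5 | 43 | 7284 |
| (e) Devils Elbow | 2 | 64 | 107.1 | 4.7 | 19 | 13,561 |
| (f) Eiler | 2 | 46 | 96.7 | 3.9 | 21 | 11,655 |
| (g) French | 3 | 36 | 92.8 | 3.9 | 32 | 9328 |
| (h) Happy Camp | 1 | 6 | 35.2 | 1.1 | 28 | 5364 |
| (i) Johnson Bar | 2 | 13 | 33.9 | 0.6 | 59 | 22,659 |
| (j) King | 2 | 20 | 56.9 | 1.8 | 35 | 11,186 |
| (k) Snag Canyon | 2 | 84 | 72.8 | 2.6 | 27 | 32,475 |
| (l) Somers | 1 | 5 | 81.7 | 6.0 | 12 | 833 |
| (m) South Fork | 3 | 23 | 40.7 | 0.8 | 74 | 27,803 |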

3.2 Computational requirements and fit

The computational requirements of the CGP model are substantially lower than those of the NGP model in terms of the number of Gibbs iterations required, time required and sampling efficiency (Table 1), with even the slowest CGP model reconstruction completing in far less time than any NGP reconstruction. In the NGP model, longer duration incidents require more Gibbs iterations and more time to converge, and complete fewer Gibbs iterations per unit time, than shorter duration incidents. In CGP models, longer duration incidents took more time to converge and completed fewer Gibbs iterations per unit time, but did not require substantially more iterations to converge. In general, the CGP model requires at most a few minutes to converge, while the NGP model takes between 13 min and 9 h.
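
The relationship between the columns of Table 1 can be checked with simple arithmetic. The sketch below assumes that the total time column is expressed in seconds, an inference from the 13 min and 9 h figures quoted above, and uses the NGP entries for the fastest and slowest incidents.

```python
# (iterations until convergence in thousands, iterations per second, reported total time)
ngp_rows = {
    "Buzzard": (8, 10.2, 785),
    "Snag Canyon": (84, 2.6, 32475),
}


def implied_seconds(thousands_of_iterations, iterations_per_second):
    """Run time implied by the iteration count and the sampling rate."""
    return 1000.0 * thousands_of_iterations / iterations_per_second


for name, (k_iter, rate, reported) in ngp_rows.items():
    implied = implied_seconds(k_iter, rate)
    print(
        f"{name}: implied {implied:.0f} s, reported {reported} s "
        f"({reported / 60:.1f} min, {reported / 3600:.2f} h)"
    )
```

The implied and reported times differ only through rounding of the tabulated rates, which supports reading the shortest NGP run as roughly 13 min and the longest as roughly 9 h.
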

With the exception of the Somers fire, the increased computational requirements of the NGP models are compensated by noticeable improvements in fit to observed data, as evidenced by the log Bayes factor scores (Table 2), goodness-of-fit statistics (Table 3), and graphical comparisons of both models (Fig. 1). While the NGP model provides a superior fit compared to the CGP model, the overall accuracy varies across incidents. For instance, MAPE under the NGP model varies widely, with high accuracy in the Snag Canyon (10%), King (14%), and Eiler (16%) wildfires, but poor goodness-of-fit in other cases such as Carlton (9109%), South Fork (1532%), and Buzzard (1498%).
Table 2

Log Bayes factor, sample size and GeoMAC burn area

| Firename | \(2\ln (BF)\) | \(N_{GeoMAC}\) | \(N_{HMS}\) | \(K_{GeoMAC}\) |
|---|---|---|---|---|
| (a) Beaver | 215.3 | 25 | 38 | 13,139 |
| (b) Big Cougar | 27.1 | 12 | 18 | 26,397 |
| (c) Buzzard | 20.5 | 9 | 19 | 159,992 |
| (d) Carlton | 92.3 | 21 | 33 | 104,731 |
| (e) Devils Elbow | 66.8 | 11 | 24 | 9751 |
| (f) Eiler | 83.6 | 15 | 26 | 13,426 |
| (g) French | 64.0 | 15 | 26 | 5611 |
| (h) Happy Camp | 201.2 | 45 | 53 | 54,782 |
| (i) Johnson Bar | 247.8 | 29 | 69 | 5383 |
| (j) King | 48.0 | 30 | 37 | 39,295 |
| (k) Snag Canyon | 149.3 | 23 | 31 | 5169 |
| (l) Somers | 0.3\(^\mathrm{a}\) | 17 | 26 | 14,644 |
| (m) South Fork | 39.9 | 27 | 58 | 26,780 |

\(^\mathrm{a}\)Negligible difference in goodness-of-fit of constant growth parameter and non-constant growth parameter model (Kass and Raftery 1995)

Table 3

Goodness-of-fit summaries

| Firename | RMSE, CGP | RMSE, NGP | MAE, CGP | MAE, NGP | MAPE (%), CGP | MAPE (%), NGP | MBE, CGP | MBE, NGP |
|---|---|---|---|---|---|---|---|---|
| (a) Beaver | 1368 | 306 | 657 | 165 | 202 | 104 | \(-468\) | 146 |
| (b) Big Cougar | 7772 | 7485 | 3455 | 2918 | 720 | 135 | 968 | 1290 |
| (c) Buzzard | 62,360 | 54,474 | 37,876 | 28,635 | 1013 | 1498 | \(-6049\) | 3286 |
| (d) Carlton | 32,042 | 41,422 | 13,341 | 34,901 | 9986 | 9109 | 9887 | 34,828 |
| (e) Devils Elbow | 1584 | 430 | 914 | 267 | 50 | 97 | \(-910\) | \(-267\) |
| (f) Eiler | 3381 | 3373 | 1163 | 1112 | 23 | 16 | 679 | 649 |
| (g) French | 1364 | 1415 | 513 | 468 | 1903 | 32 | 374 | 382 |
| (h) Happy Camp | 9973 | 8611 | 5446 | 3282 | 810 | 109 | \(-2723\) | 538 |
| (i) Johnson Bar | 1362 | 1133 | 929 | 671 | 467 | 317 | \(-266\) | 664 |
| (j) King | 1945 | 2489 | 954 | 1372 | 1501 | 14 | 20 | 892 |
| (k) Snag Canyon | 541 | 266 | 329 | 144 | 25 | 10 | \(-221\) | 0\(^\mathrm{a}\) |
| (l) Somers | 5264 | 5267 | 2931 | 2932 | 177 | 174 | 392 | 380 |
| (m) South Fork | 1758 | 1971 | 811 | 1046 | 1443 | 1532 | 427 | 814 |

\(^\mathrm{a}\)MBE is approximately \(-0.5\), rounding to zero

Fig. 1

Observed burn area estimates from GeoMAC and the Hazard Mapping System are shown as square and circular points, respectively. The median and central 90th percentile of the marginal posterior distributions of the growth curve estimates are shown under the constant growth parameter (blue) and non-constant growth parameter (red) model. Letters correspond to the wildfire incidents in Table 2

4 Sensitivity analysis

Multiple defensible priors can often be applied to the same problem for reasons such as differences in philosophies regarding their purpose in Bayesian analysis, variation in elicitation techniques, and diverse individual levels of uncertainty (Spiegelhalter et al. 2004). Given that priors are both subjective and influential on the final results, it is valuable to determine how changes to the priors may influence our burn area time series reconstructions. To that end, we perform a sensitivity analysis to gauge how the choice of priors influences the results from the Johnson Bar fire, which we reconstruct multiple times under multiple model configurations.

Five of these configurations exchange the final burn area prior, two of which are distributions based on simple assumptions about the bounds of the parameter, while the remaining three are fit to burn area data. The first distribution is called the lower-truncated inverse-uniform (LTIU) prior, \(404\times K^{-1}\sim \mathrm {U}(0,1)\), which sets no upper limit on final burn area but does set a lower limit of 404 ha. Strict upper and lower limits are imposed using the bounded inverse-uniform (BIU) prior, \(K^{-1}\sim \mathrm {U}(6.41\times 10^{-5},404^{-1})\), which constrains the burn area to be between 404 ha and the size of the 1988 Yellowstone wildfires. The remaining final burn area parameter priors are based on burn area data from the Monitoring Trends in Burn Severity project7 dataset, which includes 9050 fires larger than 404 ha across the conterminous U.S. from the years 1984–2010. The lower-truncated inverse-beta (LTIB) prior, \(404\times K^{-1}\sim \mathrm {Beta}(0.964,1.152)\), and the bounded inverse-beta (BIB) prior, \(404\times (K^{-1}-6.41\times 10^{-5})/(1/404-6.41\times 10^{-5}) \sim \mathrm {Beta}(0.958,1.149)\), are maximum-likelihood fits to the transformed burn area data; similarly, the Pareto prior, \(K\sim \mathrm {Pareto}(404,1.13)\), is fit to the untransformed data.
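
Two of the alternative final burn area priors are simple enough to sample directly, which helps show what they imply about fire size. The sketch below assumes that \(\mathrm {Pareto}(404,1.13)\) denotes a scale of 404 ha and a shape of 1.13; the function names, seed and sample size are hypothetical.

```python
import random


def sample_ltiu(rng):
    """LTIU prior: 404 / K ~ U(0, 1), so K = 404 / U."""
    u = 1.0 - rng.random()  # lies in (0, 1], which avoids division by zero
    return 404.0 / u


def sample_pareto(rng, scale=404.0, shape=1.13):
    """Pareto prior, assuming the parameters are (scale, shape)."""
    return scale * rng.paretovariate(shape)


rng = random.Random(2014)
ltiu_draws = sorted(sample_ltiu(rng) for _ in range(10_001))
pareto_draws = sorted(sample_pareto(rng) for _ in range(10_001))

print("LTIU sample median (ha):", round(ltiu_draws[5000]))
print("Pareto sample median (ha):", round(pareto_draws[5000]))
```

Both priors place all of their mass at or above 404 ha and have heavy upper tails, so sample medians are more informative summaries than sample means.
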

Four of these configurations exchange the original peak growth prior with an alternative distribution. The four alternatives are called the uniform (U), mode-zero triangle (MZT), mode-one triangle (MOT) and arcsine priors, each being a special case of the beta distribution: \(m_{U}\sim \mathrm {Beta}(1,1)\), \(m_{MZT}\sim \mathrm {Beta}(1,2)\), \(m_{MOT}\sim \mathrm {Beta}(2,1)\), and \(m_{Arcsine}\sim \mathrm {Beta}(0.5,0.5)\) (Fig. 2).
Fig. 2

Boxplots of each model parameter’s marginal posterior for each reconstruction model and fire. The left shows process component parameters including the inherent growth parameter (top), initial conditions (middle) and the process noise (bottom). The right shows the observation component parameters including final burn area (top) and the observation error (bottom). Outliers are omitted. Letters correspond to fires in Table 2

We also propose three new configurations for observation error priors and noise priors, which we elicit using slight variations of the techniques used to form the original distributions. We originally construct the observation error parameter prior by finding the uniform distribution with endpoints corresponding to specific error exceedance scenarios. We propose alternative priors for the observation error parameter by repeating this process using different exceedance probabilities for the extreme observation error variability scenario, generating a new upper bound on the uniform distribution. Specifically, the upper-truncation points are recalculated under small (0.01), medium (0.1), and high (0.5) exceedance probabilities, resulting in the alternative upper limits of 0.89, 1.40 and 3.41. Similar to the elicitation of new observation error priors, the alternative noise priors assume that the central 90th percentile of the original distribution represents a different amount of probability mass. Specifically, in our three alternative noise priors, we reinterpret the original bounds as the central 99th, 95th and 50th percentile of a lognormal distribution: \(\tau _{small}\sim \mathrm {Lognormal}(-6.827,3.330)\), \(\tau _{middle}\sim \mathrm {Lognormal}(-6.827,4.376)\), and \(\tau _{large}\sim \mathrm {Lognormal}(-6.827,12.716)\). Depending on the reconstruction model, we reanalyze the Johnson Bar fire using either 12 or 15 new configurations, where each new configuration is identical to the original, except that the prior for one of the five parameters is exchanged with an alternative. All sensitivity analysis computations use the same hardware and software as Sect. 3.
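
The alternative prior parameters quoted above can be approximately reproduced with normal quantiles. The sketch below assumes two-sided exceedance probabilities for the observation error bounds and log-scale standard deviations throughout; these are interpretations, and small differences in the last digit arise from rounding of the original 5.214.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()

# Observation error: upper bound of the uniform prior when a factor-of-10 error
# is exceeded with probability p (assumption: two-sided exceedance).
obs_upper = {
    p: math.log(10.0) / STD_NORMAL.inv_cdf(1.0 - p / 2.0)
    for p in (0.01, 0.1, 0.5)
}

# Noise prior: keep the original central-90% bounds fixed on the log scale and
# reinterpret them as the central 99%, 95% or 50% interval of a lognormal.
original_sdlog = 5.214
half_width = original_sdlog * STD_NORMAL.inv_cdf(0.95)
noise_sdlog = {
    coverage: half_width / STD_NORMAL.inv_cdf(0.5 + coverage / 2.0)
    for coverage in (0.99, 0.95, 0.50)
}

for p, bound in obs_upper.items():
    print(f"exceedance {p}: upper limit {bound:.2f}")
for coverage, sd in noise_sdlog.items():
    print(f"coverage {coverage}: sdlog {sd:.3f}")
```

A wider assumed coverage for the same bounds yields a tighter prior, which is why the 99% reinterpretation gives the smallest log-scale standard deviation.
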

The computational requirements are largely invariant to the majority of substitutions, with the exception of the MOT prior in the NGP model, which converges in about 20–30% of the iterations and time compared to the original configuration. For the CGP model, the largest response occurs in the final burn area parameter when substituting the LTIU distribution, increasing the posterior mean by 1%. In the NGP model, the largest response occurs in the noise parameter when substituting the MOT distribution, reducing the posterior mean by 2% (Fig. 3). The Johnson Bar growth curve reconstructions are fairly robust to changes in the priors, with the most dramatic responses occurring early in the time series, where little or no data are available. The largest change to the growth curve is associated with substituting the arcsine distribution, which inflates the initial condition parameter by about 1%. The sensitivity to prior substitutions decreases shortly after the ignition, reducing to near-zero levels in the last days of the reconstruction (Fig. 4).
Fig. 3

Boxplots of four model parameters’ marginal posteriors with the original and new prior configurations, under the non-constant growth parameter (left panels) and constant growth parameter (right panels) models. The x-axis labels denote the type of prior substitution

Fig. 4

The ratio of the non-constant growth parameter model’s median burn area time series reconstruction under the original (Fig. 1j) and alternative prior configurations. Values greater than one represent burn area estimates that are larger in the new prior configuration than in the original

5 Conclusions

Our two reconstruction models represent novel methods of improving the quality of burn area time series and have a number of desirable features. Our growth model generates reconstructions that have behavior consistent with the underlying process: non-decreasing, non-negative, and sigmoidal. The priors incorporate additional information from independent historical growth records, providing the reconstructions with a guide of typical wildfire growth curve characteristics and uncertainty. These state space models are also attractive because they incorporate multiple data sources, reflecting the uncertainty in the growth curve and also permitting information borrowing when needed. They are easily modifiable to accommodate a variety of other data generating and growth processes beyond those explored here. The state space approach is also ideal because the model output is fairly easy to interpret as the probability distribution of likely growth curves given the observed data and known fire behavior.

The NGP model is particularly well-suited for capturing daily variation in fire growth, but the computational requirements are much higher than in the CGP model. If the computational requirements are too burdensome in a given application, a couple of mitigation strategies are available. For instance, the marginal convergence tends to be slower towards the end of the time series and the introduction of an extinguishment parameter would lessen parameter space redundancy and by extension could increase speed. The use of better hardware and more efficient sampling algorithms, like Hamiltonian MCMC, are also obvious strategies for reducing this burden (Neal 2011). In some cases, such as with short duration fires, the NGP model may not be needed and strategic use of the CGP model can substantially reduce overall time and computational requirements.

These reconstruction models, and variations of them, have a number of potential applications in fire management and research. A wildfire growth curve database could be organized using these reconstruction models, where all available data are aggregated to produce high quality growth curve estimates with uncertainty. By relaxing the constraints on the initial conditions so as to allow peak growth events beyond the range of observations, the reconstruction models can be modified to simulate the future growth of not yet extinguished fires. Future iterations of the model could also incorporate environmental covariates into the reconstruction estimates, describe spatially explicit wildfire growth processes and behaviors, and explore how the models can be useful for other non-wildfire applications. Although we explore the sensitivity of the models to perturbations in the prior choice in Sect. 4, the flexibility of the state space models suggests that many other structural changes could also be applied, influencing the parameter estimates in unknown ways. Other model customizations could include changes to the process error structure, the use of additional variables for other behaviors such as extinguishment, changing the observation error structure, adding or omitting data, and using alternative growth models.

In closing, our reconstruction models offer a natural way of integrating prior knowledge and data from multiple sources into a single coherent estimate of the underlying growth curve that includes estimates of associated uncertainties. Data quality issues are handled elegantly, managing missing and erroneous data without discarding potentially useful information. We do not recommend one of the two models over the other, but wish to highlight the unique circumstances in which each approach may be most beneficial. The NGP model is particularly well-suited for describing the daily variation in wildfire growth and improves the quality of the growth curves based on satellite and ground-based observations alone. However, the relatively high computational requirements of the NGP model suggest that the CGP model may be appropriate in situations where processing time is a constraint. While further sensitivity analysis is recommended, the results for the Johnson Bar fire are robust to a range of prior substitutions, suggesting that under typical applications of this model configuration, prior sensitivity is not likely a serious issue. We recommend that future research explore the potential of these models as data reconciliation tools, as well as how these models may be modified to meet other research and management needs.

Footnotes

  3. Maximum-likelihood results and QQ-plots are omitted for brevity, but are available upon request.

  4. If \(Z\sim \mathrm{Gen.Gamma}(r,\lambda,b)\), then \(f_{Z}(x)=b\lambda^{br}x^{br-1}e^{-(\lambda x)^{b}}/\varGamma(r)\).

  7. http://www.mtbs.gov, data acquired April 2013.

Notes

Acknowledgements

This research was made possible by the financial and educational contributions of the Northwest Climate Science Center fellowship program, for which we are very grateful. We would like to recognize Eli Holmes, Eric Ward and Mark Schuerell for suggesting the use of state space modelling, and William Chen, Cole Monahan, Kiva Oken, Brian Potter, and the AirFire team for their helpful feedback. We would also like to acknowledge the Pacific Wildland Fire Sciences Laboratory and the USFS Pacific Northwest Research Station for their support. Finally, we are grateful for the advice provided by the two anonymous reviewers, which dramatically improved the presentation of this work.

Copyright information

© The Author(s) 2018

Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Authors and Affiliations

  1. University of Washington, Seattle, USA
  2. Norwegian Computing Center, Oslo, Norway
  3. Pacific Wildland Fire Sciences Laboratory, US Forest Service, Seattle, USA
