1 Introduction

Forest inventory and monitoring programs report estimates of parameters related to forest area and biomass using data acquired from arrays of ground plots. Although completely valid inferences can be constructed using only the ground data, the resulting precision may be less than acceptable, particularly for highly variable populations and for regions for which sampling intensities are small due to cost and logistical constraints. Remotely sensed auxiliary data, often in the form of forest attribute maps, have the potential to increase the precision of estimates with no increase in sample size.

Although the utility of maps based on remotely sensed auxiliary information for enhancing inference is well-documented (GOFC-GOLD 2014; GFOI 2013), acquisition and processing of the remotely sensed data for large regions may be expensive, labor-intensive, and time-consuming. For example, the Forest Inventory and Analysis (FIA) program of the US Forest Service conducts the national forest inventory (NFI) of the USA and reports inferences for parameters related to forest area and biomass at 5-year intervals. Because acquisition of nationwide remotely sensed data and construction of the necessary maps are beyond the program’s capabilities, particularly at 5-year intervals, the Landsat-based National Land Cover Dataset (NLCD) (Vogelmann et al. 2001; Homer et al. 2004, 2007) constructed by the US Geological Survey is used. However, the NLCD dates do not necessarily coincide with the FIA reporting dates, and the NLCD has been updated at only approximately 10-year intervals. As a second example, the utility of lidar-assisted approaches for estimating forest biomass is increasingly reported (e.g., d’Oliveira et al. 2012). However, for many countries with tropical forests, the cost of even a single set of lidar data, whether acquired wall-to-wall or in strips, may be prohibitive. The cost of multiple sets of lidar data corresponding to periodic remeasurements of the ground plots would be even more prohibitive.

In addition to the cost factors, estimation procedures based on aggregating map unit data, often characterized as pixel counting, are inherently biased because of map classification and prediction errors. Further, map accuracy indices produce no direct estimates of bias or variances. A popular emerging approach that produces these estimates while simultaneously compensating for the effects of both outdated maps and map errors is to combine map estimates with ground data using the design-based, model-assisted regression estimator (Baffetta et al. 2009; Gregoire et al. 2011; McRoberts 2010, 2011; d’Oliveira et al. 2012; McRoberts and Walters 2012; McRoberts et al. 2013; Næsset et al. 2011, 2013a, 2013b; Vibrans et al. 2013; Sannier et al. 2014), also characterized as the generalized regression estimator (GREG) (Särndal 2011). This estimator adjusts map-based estimates for classification and prediction errors due to factors such as deviations between the dates of the remotely sensed data and the ground reference data (Sect. 2.3.3). Although this feature makes the estimator unbiased, or at least nearly unbiased, the trade-off for adjustment of greater map errors is less precise estimates. Nevertheless, if the map errors are not substantial, the estimator may still be more precise than simple random sampling estimators that use only the ground reference data. Other than Næsset et al. (2011) who comment that fitted models do not compensate for the effects of temporal deviations between response and predictor variables, few reports have been published regarding the degree to which compensation is possible for substantially outdated maps or the degree to which precision is affected.

The overall objective of the study was to assess the degree to which temporal differences between remote sensing-based maps and ground data affect the bias and precision of estimates of inventory parameters. For a study area in Minnesota, USA, the population parameter of interest was proportion forest area, and a Landsat-based map of the probability of forest cover was used to enhance estimation. For a study area in Våler, Norway, the population parameter of interest was mean biomass per unit area, and a biomass map based on airborne laser scanning (ALS) data was used to enhance estimation. The rationale for these choices was twofold. First, forest area and volume-related variables such as biomass are the two most important and commonly reported forest inventory and monitoring variables. Second, these study areas, response variables, and auxiliary data provide a diverse context for the study.

2 Materials and methods

2.1 Study areas

2.1.1 Minnesota study area

The study area was defined by the portion of the row 27, path 27, Landsat scene in northeastern Minnesota, USA, that was cloud-free for the two image dates, 16 July 2002 and 30 July 2007 (Fig. 1). The Landsat Thematic Mapper (TM) spectral data were transformed using the normalized difference vegetation index (NDVI) transformation (Rouse et al. 1973) and the three tasseled cap transformations (TCgreen, TCbright, TCwet) (Kauth and Thomas 1976; Crist and Cicone 1984) for each image. The six original bands of spectral data and the four transformations were used as independent variables when constructing models of the relationship between the ground and remotely sensed data (Sect. 3.1.1).
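As a minimal sketch of the NDVI transformation, the following assumes that red and near-infrared reflectances (TM bands 3 and 4) are available as arrays. The function name `ndvi` and the sample values are hypothetical. The tasseled cap transformations are not reproduced because their coefficients are not given in this text.

```python
import numpy as np

def ndvi(red, nir):
    """Normalized difference vegetation index: (NIR - red) / (NIR + red)."""
    red = np.asarray(red, dtype=float)
    nir = np.asarray(nir, dtype=float)
    total = nir + red
    # Assumption: pixels where both bands are zero are returned as NaN
    # rather than raising a division-by-zero warning.
    out = np.full(total.shape, np.nan)
    np.divide(nir - red, total, out=out, where=total != 0)
    return out

# Hypothetical reflectances for three pixels (illustrative values only).
red = np.array([0.05, 0.20, 0.0])
nir = np.array([0.45, 0.25, 0.0])
values = ndvi(red, nir)
```

Dense green vegetation reflects strongly in the near-infrared and weakly in the red, so larger index values generally indicate more vegetation.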

Fig. 1 Study area in northeastern Minnesota, USA. Source: state boundaries - National Atlas of the United States, 2005

Ground training data were obtained for permanent plots established by the FIA program using a quasi-systematic sampling design that is regarded as producing an equal probability sample (McRoberts et al. 2010). Each FIA plot consists of four 7.32-m (24-ft) radius circular subplots that are configured as a central subplot and three peripheral subplots with centers located at distances of 36.58 m (120 ft) and azimuths of 0°, 120°, and 240° from the center of the central subplot. Centers of forested, partially forested, or previously forested plots are estimated using global positioning system (GPS) receivers, whereas centers of non-forested plots are verified using aerial imagery and digitization methods. Data were available for 238–252 FIA plots measured each year in the interval [2000, 2009]. Plots in the study area are remeasured at 5-year intervals; so, for example, the plots measured in 2000 were remeasured in 2005.
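The plot geometry described above can be sketched as follows. The sketch assumes that azimuths are measured clockwise from north and that offsets are expressed as (east, north) in meters; neither convention is stated explicitly in the text, and all names are hypothetical.

```python
import math

SUBPLOT_RADIUS_M = 7.32            # subplot radius from the text
DISTANCE_M = 36.58                 # distance between subplot centers from the text
AZIMUTHS_DEG = (0.0, 120.0, 240.0)

def subplot_offsets(distance=DISTANCE_M, azimuths=AZIMUTHS_DEG):
    """Return (east, north) offsets in meters of the peripheral subplot centers."""
    offsets = []
    for az in azimuths:
        theta = math.radians(az)   # assumption: azimuth is clockwise from north
        offsets.append((distance * math.sin(theta), distance * math.cos(theta)))
    return offsets

offsets = subplot_offsets()
subplot_area = math.pi * SUBPLOT_RADIUS_M ** 2   # area of one subplot in m^2
```

The computed subplot area is approximately 168.3 m2, far smaller than a 900-m2 Landsat pixel.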

Field crews visually estimate the proportion of each subplot that satisfies the FIA definition of forest land: minimum area of 0.4 ha (1.0 ac), minimum crown cover of 10 %, minimum crown cover width of 36.6 m (120 ft), and forest land use. Field crews also observe species and measure diameter at-breast-height (dbh) (1.37 m, 4.5 ft) and height for all trees with dbh of at least 12.7 cm (5 in). Growing stock volumes are estimated for individual measured trees using statistical models, aggregated at subplot-level, expressed as volume per unit area, and considered to be observations without error (McRoberts and Westfall 2014).

For this study, data for only the central subplot of each plot were used to avoid dealing with spatial correlation among observations for subplots of the same plot. Deletion of data for the remaining subplots resulted in little loss of information, because the correlation among observations for subplots of the same plot was greater than 0.85. Because the 168.3-m2 subplots are considerably smaller than the 900-m2 TM pixels, the proportion of a subplot on one side of a forest/non-forest boundary may be considerably different from the proportion of the pixel on the same side of that boundary. Therefore, subplots bisected by such boundaries were deleted for purposes of model construction but retained for purposes of estimation. In addition, because FIA field crews classify subplots with respect to land use, not land cover, subplots whose tree cover has been removed are still classified as forest if forest land use is expected to continue. Thus, observations of land cover for subplots with forest land use but no measurable volume were considered to be missing at random and were deleted for purposes of model construction but retained for purposes of estimation. Following deletions, forest/non-forest observations for 186–202 plots per year remained for model construction. For the central subplots, proportion forest was combined with the 10 Landsat variables for pixels containing subplot centers. For future reference, the term plot refers to the central subplot of each FIA plot cluster.

Fig. 2 Study area in Våler Municipality in southeastern Norway

The Minnesota study area consists primarily of State and County ownerships that are managed for timber production with rotation cycles on the order of 40 years. However, the study area also includes substantial numbers of private ownerships which may or may not be managed for specific objectives. The dominant forest types are aspen-birch (Populus spp., Betula spp.) and maple-beech-birch (Acer spp., Fagus spp., Betula spp.) with lesser amounts of spruce-fir (Picea spp., Abies spp.).

2.1.2 Våler study area

The 853-ha study area was located in Våler Municipality in southeastern Norway and included 176 systematically distributed, circular, 200-m2 forest inventory plots (Fig. 2). The dominant tree species are Norway spruce (Picea abies (L.) Karst.) and Scots pine (Pinus sylvestris L.). Tree-level aboveground biomass (AGB, Mg/ha) was estimated for both 1999 and 2010 using statistical models based on field observations of species and measurements of dbh (1.3 m) and height (Marklund 1988). For 1999 and 2010, plot-level AGB was estimated as the sum of individual tree AGB predictions, scaled to Mg/ha, and considered to be observations without error (McRoberts and Westfall 2014). Aerial stereo photography was used to delineate four classes related to stand age and species dominance that served as the basis for four strata: (1) recently regenerated forest, (2) young forest, (3) mature, spruce-dominated forest, and (4) mature, pine-dominated forest. Sampling intensities were approximately equal for the first three strata, but for the fourth stratum, the intensity was only approximately one-third of that for the other three strata (Næsset et al. 2013a).

Wall-to-wall ALS data were acquired for the study area in 1999 and 2010. For each year, distributions of first echo heights were constructed for the 200-m2 circular plots and 200-m2 square grid cells that tessellated the study area. A threshold of 1.3 m above the ground surface was used to remove the effects of echoes from ground vegetation whose biomass is not included in tree-level AGB. For each plot and cell, heights corresponding to the 10th, 20th, …, 100th percentiles of the distributions were calculated and were available for inclusion as independent variables for constructing models of the relationship between AGB and the ALS metrics.
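A minimal sketch of the height-metric calculation for a single plot or grid cell follows. It assumes that echoes strictly above the 1.3-m threshold are retained, that percentiles are obtained by NumPy's default linear interpolation, and that a cell with no retained echoes receives zeros. These three choices, and all names and sample heights, are assumptions for illustration and are not taken from the study.

```python
import numpy as np

def height_percentiles(first_echo_heights, threshold=1.3):
    """Heights at the 10th, 20th, ..., 100th percentiles of first echoes above threshold."""
    heights = np.asarray(first_echo_heights, dtype=float)
    canopy = heights[heights > threshold]   # drop echoes from ground vegetation
    probs = np.arange(10, 101, 10)          # 10, 20, ..., 100
    if canopy.size == 0:
        # Assumption: cells without canopy echoes get zero-valued metrics.
        return np.zeros(probs.size)
    return np.percentile(canopy, probs)

# Hypothetical first-echo heights (m) for one 200-m2 cell.
heights = np.array([0.1, 0.4, 1.0, 2.0, 5.5, 8.0, 12.0, 15.5, 17.0, 21.0])
metrics = height_percentiles(heights)
```

The ten values returned for each plot or cell are the candidate independent variables for the biomass models.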

2.2 Map construction

2.2.1 Mapping forest/non-forest

The relationship between a dichotomous response variable such as forest/non-forest, here denoted Y (y = 0 denotes non-forest, y = 1 denotes forest), and continuous independent variables, X, is often expressed in the form,

$$ {\mathrm{p}}_{\mathrm{i}}=\mathrm{f}\left({\mathbf{X}}_{\mathrm{i}};\boldsymbol{\upbeta} \right)+{\upvarepsilon}_{\mathrm{i}} $$
(1)

where i indexes population units, \( {\mathrm{p}}_{\mathrm{i}} \) is the probability that \( {\mathrm{y}}_{\mathrm{i}}=1 \), β is a vector of parameters to be estimated, and \( {\upvarepsilon}_{\mathrm{i}} \) is the random residual with mean 0 (Agresti 2007). The function, \( \mathrm{f}\left({\mathbf{X}}_{\mathrm{i}};\boldsymbol{\upbeta} \right) \), expresses the expectation of Y in terms of X and β and is often formulated using the logistic function leading to the model,

$$ {\mathrm{p}}_{\mathrm{i}}=\frac{ \exp \left({\upbeta}_0+{\displaystyle \sum_{\mathrm{j}=1}^{\mathrm{J}}{\upbeta}_{\mathrm{j}}{\mathrm{x}}_{\mathrm{i}\mathrm{j}}}\right)}{1+ \exp \left({\upbeta}_0+{\displaystyle \sum_{\mathrm{j}=1}^{\mathrm{J}}{\upbeta}_{\mathrm{j}}{\mathrm{x}}_{\mathrm{i}\mathrm{j}}}\right)}+{\upvarepsilon}_{\mathrm{i}} $$
(2)

where j = 1, …, J indexes the independent variables, and exp (.) is the exponential function. The model parameters are estimated using maximum likelihood methods as described by Agresti (2007).
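One common way to obtain the maximum likelihood estimates for Eq. (2) is Newton-Raphson iteration on the binomial log-likelihood. The following is a minimal sketch under that assumption; it is not necessarily the algorithm used in the study, and the function names and the small dataset are hypothetical.

```python
import numpy as np

def fit_logistic(X, y, n_iter=50, tol=1e-8):
    """Maximum likelihood estimates (beta_0, beta_1, ..., beta_J) by Newton-Raphson."""
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    y = np.asarray(y, dtype=float)
    A = np.column_stack([np.ones(len(y)), X])    # column of ones for beta_0
    beta = np.zeros(A.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(A @ beta)))    # algebraically equal to Eq. (2)
        gradient = A.T @ (y - p)                 # score vector
        hessian = A.T @ (A * (p * (1.0 - p))[:, None])
        step = np.linalg.solve(hessian, gradient)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

def predict_probability(X, beta):
    """Predicted probability of forest for each row of X."""
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    A = np.column_stack([np.ones(X.shape[0]), X])
    return 1.0 / (1.0 + np.exp(-(A @ beta)))

# Hypothetical single-predictor data: 0 = non-forest, 1 = forest.
x = np.array([-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0, 2.5])
y = np.array([0, 0, 1, 0, 0, 1, 0, 1, 1, 1])
beta_hat = fit_logistic(x, y)
p_hat = predict_probability(x, beta_hat)
```

Because the model includes an intercept, the mean of the fitted probabilities equals the observed proportion of forest at the maximum likelihood solution, which provides a simple convergence check.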

Parameters for the binomial logistic regression model were estimated separately using the 2002 FIA and Landsat data and using the 2007 FIA and Landsat data. A three-step procedure was used to assess quality of fit of the models to the data: (1) all plot observation/model prediction pairs, (yi, \( {\widehat{\mathrm{p}}}_{\mathrm{i}} \)), were ordered with respect to \( {\widehat{\mathrm{p}}}_{\mathrm{i}} \); (2) the ordered pairs were grouped into categories of approximately equal numbers of pairs, and the group means of the plot observations and the corresponding model predictions were calculated; and (3) a graph of the observation means versus the model prediction means was constructed. If the model is correctly specified, a graph of means of observations against means of predictions should lie along the 1:1 line.
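The three-step procedure can be sketched as follows. The number of groups is not stated in the text, so `n_groups` is an assumption, as are the function name and the sample values. The final graphing step is left to the reader.

```python
import numpy as np

def grouped_means(observed, predicted, n_groups=10):
    """Order pairs by prediction, split into near-equal groups, return group means."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    order = np.argsort(predicted, kind="stable")                # step 1: order by prediction
    obs_groups = np.array_split(observed[order], n_groups)      # step 2: near-equal groups
    pred_groups = np.array_split(predicted[order], n_groups)
    obs_means = np.array([g.mean() for g in obs_groups])
    pred_means = np.array([g.mean() for g in pred_groups])
    return obs_means, pred_means                                # step 3: plot these pairs

# Hypothetical observation/prediction pairs.
observed = np.array([0, 1, 0, 1, 1, 0])
predicted = np.array([0.1, 0.9, 0.2, 0.8, 0.3, 0.7])
obs_means, pred_means = grouped_means(observed, predicted, n_groups=3)
```

Plotting `obs_means` against `pred_means` together with the 1:1 line gives the graph described in step (3).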

2.2.2 Mapping biomass

For the Våler study area, a nonlinear logistic model was used to estimate the relationship between AGB and the ALS metrics. The model had the mathematical form,

$$ {\mathrm{y}}_{\mathrm{i}}=\frac{\upalpha}{1+ \exp \left({\upbeta}_0+{\displaystyle \sum_{\mathrm{j}=1}^{\mathrm{J}}{\upbeta}_{\mathrm{j}}{\mathrm{x}}_{\mathrm{i}\mathrm{j}}}\right)}+{\upvarepsilon}_{\mathrm{i}}, $$
(3)

where i indexes population units, \( {\mathrm{x}}_{\mathrm{i}\mathrm{j}} \) is the jth lidar metric for the ith unit, α and the βs are parameters to be estimated, and \( {\upvarepsilon}_{\mathrm{i}} \) is the residual term. An advantage of the logistic model expressed by Eq. (3) over a linear model is that all predictions are constrained by the lower horizontal asymptote of ŷ = 0 and the upper horizontal asymptote of \( \widehat{\mathrm{y}}=\widehat{\upalpha} \) which is estimated from the sample data. This logistic regression model should not be confused with the binomial logistic regression model described in Sect. 2.2.1.

The model was fit using least squares techniques with the parameters estimated separately for each stratum for each of 1999 and 2010 using the corresponding inventory and ALS data. For each model, the quality of fit of the model to the data was assessed using the pseudo-R2 statistic, \( {\mathrm{R}}^{2*} \), calculated as

$$ {\mathrm{R}}^{2*}=\frac{{\mathrm{SS}}_{\mathrm{mean}}-{\mathrm{SS}}_{\mathrm{err}}}{{\mathrm{SS}}_{\mathrm{mean}}}, $$
(4)

where SSmean is the sum of squared deviations of the observations around their mean and SSerr is the sum of squared deviations of the observations from their predictions. The same three-step procedure as described in Sect. 2.2.1 was also used, albeit using ŷi rather than \( {\widehat{\mathrm{p}}}_{\mathrm{i}} \).
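As a minimal sketch, the mean function of Eq. (3) and the pseudo-R2 of Eq. (4) can be written as below. The parameter values and data are hypothetical and are not estimates from the study; the least squares fitting step itself is omitted.

```python
import numpy as np

def logistic_agb(X, alpha, beta0, betas):
    """Mean function of Eq. (3); X has one row per unit and one column per metric."""
    X = np.asarray(X, dtype=float)
    betas = np.asarray(betas, dtype=float)
    return alpha / (1.0 + np.exp(beta0 + X @ betas))

def pseudo_r2(observed, predicted):
    """Eq. (4): (SS_mean - SS_err) / SS_mean."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    ss_mean = np.sum((observed - observed.mean()) ** 2)
    ss_err = np.sum((observed - predicted) ** 2)
    return (ss_mean - ss_err) / ss_mean

# Hypothetical single height metric (m) and hypothetical parameter values.
X = np.array([[2.0], [5.0], [10.0], [20.0]])
agb_obs = np.array([10.0, 50.0, 120.0, 260.0])
agb_pred = logistic_agb(X, alpha=300.0, beta0=3.0, betas=[-0.3])
r2_star = pseudo_r2(agb_obs, agb_pred)
```

Predictions from `logistic_agb` always lie between 0 and `alpha`, which illustrates the asymptote constraint noted for Eq. (3). Unlike the ordinary R2 for linear models, the pseudo-R2 can be negative when a model predicts worse than the sample mean.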

2.3 Analyses

2.3.1 Assumptions and technical objectives

All analyses were based on three underlying assumptions: (1) a finite population, U, consisting of N units in the form of either square, 900-m2 Landsat pixels for the Minnesota dataset or square 200-m2 grid cells for the Våler dataset; (2) a sample, S, of n population units in the form of the plots; and (3) availability of auxiliary remotely sensed Landsat data for all pixels and ALS data for all lidar cells. In the following sections, the terms population unit, pixel, and grid cell are used interchangeably.

For assessments of forest area, the objective is typically to estimate the area for a class of the response variable. Because the estimate of class area is simply the product of total area, which is usually known, and the estimate of the class area proportion, the parameter of interest for the Minnesota portion of the study was proportion forest at time t, denoted \( {\mu}^t \). For the Våler study area, the parameter of interest was mean AGB at time t, also denoted \( {\mu}^t \). For inventory applications, the ultimate objective is construction of an inference in the form of an approximately 95 % confidence interval for \( {\mu}^t \) expressed as

$$ {\widehat{\mu}}^t\pm 2\cdot \sqrt{V\widehat{a}r\left({\widehat{\mu}}^t\right)}, $$
(5)

where \( V\widehat{a}r\left({\widehat{\mu}}^t\right) \) is the estimator of the variance of \( {\widehat{\mu}}^t \). Thus, the study emphasis was estimation of μ t and the standard error of its estimate, \( SE\left({\widehat{\mu}}^t\right)=\sqrt{V\widehat{a}r\left({\widehat{\mu}}^t\right)} \).

2.3.2 Simple random sampling estimators

For both study areas and for each year for which ground data were available, the parameters of interest were estimated using the simple random sampling (SRS) estimators,

$$ {\widehat{\mu}}_{\mathrm{SRS}}^{{\mathrm{t}}_{\mathrm{grnd}}}=\frac{1}{\mathrm{n}}{\displaystyle \sum_{\mathrm{i}\in \mathrm{S}}{\mathrm{z}}_{\mathrm{i}}^{{\mathrm{t}}_{\mathrm{grnd}}}} $$
(6a)

and

$$ V\widehat{a}r\left({\widehat{\mu}}_{\mathrm{SRS}}^{{\mathrm{t}}_{\mathrm{grnd}}}\right)=\frac{1}{\mathrm{n}\left(\mathrm{n}-1\right)}{\displaystyle \sum_{\mathrm{i}\in \mathrm{S}}{\left({\mathrm{z}}_{\mathrm{i}}^{{\mathrm{t}}_{\mathrm{grnd}}}-{\widehat{\mu}}_{\mathrm{SRS}}^{{\mathrm{t}}_{\mathrm{grnd}}}\right)}^2}, $$
(6b)

where \( {t}_{grnd} \) denotes the date of the ground data and z denotes the ground observations of forest or non-forest for the Minnesota study area or of AGB for the Våler study area.
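Eqs. (6a) and (6b), together with the confidence interval of Eq. (5), can be sketched as follows. The function names and the ten sample observations are hypothetical.

```python
import numpy as np

def srs_estimate(z):
    """Eq. (6a) and Eq. (6b): SRS mean and the estimated variance of that mean."""
    z = np.asarray(z, dtype=float)
    n = z.size
    mean = z.sum() / n
    var = np.sum((z - mean) ** 2) / (n * (n - 1))
    return mean, var

def confidence_interval(mean, var, k=2.0):
    """Eq. (5): approximate 95 % interval, mean +/- k standard errors."""
    se = np.sqrt(var)
    return mean - k * se, mean + k * se

# Hypothetical forest (1) / non-forest (0) observations for ten plots.
z = np.array([1, 1, 0, 1, 0, 1, 1, 1, 0, 1])
mu_srs, var_srs = srs_estimate(z)
lower, upper = confidence_interval(mu_srs, var_srs)
```

For these values the estimated mean is 0.7 and the estimated variance is 2.1/90, so the standard error is approximately 0.153.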

2.3.3 Model-assisted regression estimators

Model-assisted regression estimation is an approach to increasing precision that uses auxiliary information. For both study areas and each ground data year, an initial estimate can be calculated as the mean over map predictions, regardless of the year of the map,

$$ {\widehat{\mu}}_{\mathrm{i}\mathrm{nit}}^{{\mathrm{t}}_{\mathrm{map}}}=\frac{1}{\mathrm{N}}{\displaystyle \sum_{\mathrm{i}=1}^{\mathrm{N}}{\widehat{\mathrm{z}}}_{\mathrm{i}}^{{\mathrm{t}}_{\mathrm{map}}}}, $$
(7a)

where tmap denotes the date of the remotely sensed data and \( \widehat{z} \) denotes the prediction of the probability of forest from Eq. (2) or the AGB prediction from Eq. (3). However, this estimator may be biased due to map classification and prediction error for multiple reasons such as changes in the response variable between the map and ground data dates. The bias of this estimator is estimated as

$$ \mathrm{B}\widehat{\mathrm{i}}\mathrm{a}\mathrm{s}\left({\widehat{\mu}}_{\mathrm{i}\mathrm{nit}}^{{\mathrm{t}}_{\mathrm{map}}}\right)=\frac{1}{\mathrm{n}}{\displaystyle \sum_{\mathrm{i}\in \mathrm{S}}\left({\widehat{\mathrm{z}}}_{\mathrm{i}}^{{\mathrm{t}}_{\mathrm{map}}}-{\mathrm{z}}_{\mathrm{i}}^{{\mathrm{t}}_{\mathrm{grnd}}}\right)}. $$
(7b)

The model-assisted, generalized regression estimator (GREG) of the mean is

$$ \begin{array}{c}\hfill {\widehat{\mu}}_{\mathrm{GREG}}^{{\mathrm{t}}_{\mathrm{grnd}}}={\widehat{\mu}}_{\mathrm{i}\mathrm{nit}}^{{\mathrm{t}}_{\mathrm{map}}}-\mathrm{B}\widehat{\mathrm{i}}\mathrm{a}\mathrm{s}\;\left({\widehat{\mu}}_{\mathrm{i}\mathrm{nit}}^{{\mathrm{t}}_{\mathrm{map}}}\right)\hfill \\ {}\hfill =\frac{1}{\mathrm{N}}{\displaystyle \sum_{\mathrm{i}=1}^{\mathrm{N}}{\widehat{\mathrm{z}}}_{\mathrm{i}}^{{\mathrm{t}}_{\mathrm{map}}}}-\frac{1}{\mathrm{n}}{\displaystyle \sum_{\mathrm{i}\in \mathrm{S}}\left({\widehat{\mathrm{z}}}_{\mathrm{i}}^{{\mathrm{t}}_{\mathrm{map}}}-{\mathrm{z}}_{\mathrm{i}}^{{\mathrm{t}}_{\mathrm{grnd}}}\right)}\hfill \end{array} $$
(7c)

with variance estimator,

$$ V\widehat{a}r\left({\widehat{\mu}}_{\mathrm{GREG}}^{{\mathrm{t}}_{\mathrm{grnd}}}\right)=\frac{1}{\mathrm{n}\left(\mathrm{n}-1\right)}{\displaystyle \sum_{\mathrm{i}\in \mathrm{S}}{\left({\upvarepsilon}_{\mathrm{i}}-\overline{\upvarepsilon}\right)}^2}, $$
(7d)

where \( {\varepsilon}_i=\left({\widehat{z}}_i^{t_{map}}-{z}_i^{t_{grnd}}\right) \) and \( \overline{\varepsilon}=\frac{1}{n}{\displaystyle \sum_{i\in S}{\varepsilon}_i} \) (Särndal et al. 1992, Sect. 6.5; Särndal 2011). The potential advantage of the GREG estimators is that \( {\displaystyle \sum_{i\in S}{\left({\varepsilon}_i-\overline{\varepsilon}\right)}^2} \) from Eq. (7d) may be smaller than \( {\displaystyle \sum_{i\in S}{\left({z}_i^{t_{grnd}}-{\widehat{\mu}}_{SRS}^{t_{grnd}}\right)}^2} \) from Eq. (6b), in which case \( V\widehat{a}r\left({\widehat{\mu}}_{GREG}^{t_{grnd}}\right) \) is smaller than \( V\widehat{a}r\left({\widehat{\mu}}_{SRS}^{t_{grnd}}\right) \). The GREG bias and variance estimates are generally expected to be smaller when \( \left|{t}_{map}-{t}_{grnd}\right| \) is smaller. However, the estimator is still, at worst, nearly unbiased, regardless of the difference in dates.
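Eqs. (7a) through (7d) can be sketched as below, assuming an equal probability sample as in the text. The function name, the eight-unit population, and the four-unit sample are hypothetical.

```python
import numpy as np

def greg_estimate(map_predictions, sample_predictions, sample_observations):
    """Return (mu_init, bias, mu_greg, var_greg) following Eqs. (7a)-(7d)."""
    z_hat_all = np.asarray(map_predictions, dtype=float)    # all N population units
    z_hat_s = np.asarray(sample_predictions, dtype=float)   # map values at the n plots
    z_s = np.asarray(sample_observations, dtype=float)      # ground values at the n plots
    n = z_s.size
    mu_init = z_hat_all.mean()                               # Eq. (7a)
    eps = z_hat_s - z_s                                      # prediction minus observation
    bias = eps.mean()                                        # Eq. (7b)
    mu_greg = mu_init - bias                                 # Eq. (7c)
    var_greg = np.sum((eps - bias) ** 2) / (n * (n - 1))     # Eq. (7d)
    return mu_init, bias, mu_greg, var_greg

# Hypothetical map of forest probabilities for N = 8 units; units 0, 2, 4, 6 are sampled.
z_hat_map = np.array([0.9, 0.8, 0.2, 0.7, 0.1, 0.6, 0.95, 0.75])
sample_idx = np.array([0, 2, 4, 6])
z_ground = np.array([1.0, 0.0, 0.0, 1.0])
mu_init, bias, mu_greg, var_greg = greg_estimate(z_hat_map, z_hat_map[sample_idx], z_ground)
```

In this example the map mean of 0.625 is adjusted downward by the estimated bias of 0.0375 to give a GREG estimate of 0.5875. The variance depends only on the spread of the residuals, which is why accurate, current maps yield precise estimates and outdated maps do not.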

2.3.4 Stratified estimators

For the Våler study area, the unequal sampling intensities within strata necessitated use of stratified estimators. Because the plots were distributed systematically, the within-strata sample sizes were considered random rather than fixed as would be the case for stratified sampling. Thus, the post-stratified (STR) estimators as provided by Cochran (1977) were used,

$$ {\widehat{\mu}}_{\mathrm{STR}}={\displaystyle \sum_{\mathrm{h}=1}^{\mathrm{H}}{\mathrm{w}}_{\mathrm{h}}\cdot {\widehat{\mu}}_{\mathrm{h}}}, $$
(8a)

and

$$ V\widehat{a}r\left({\widehat{\mu}}_{\mathrm{STR}}\right)={\displaystyle \sum_{\mathrm{h}=1}^{\mathrm{H}}\left[{\mathrm{w}}_{\mathrm{h}}\cdot \frac{{\widehat{\upsigma}}_{\mathrm{h}}^2}{\mathrm{n}}+\left(1-{\mathrm{w}}_{\mathrm{h}}\right)\cdot \frac{{\widehat{\upsigma}}_{\mathrm{h}}^2}{{\mathrm{n}}^2}\right]}, $$
(8b)

where n is the total sample size, h = 1, …, H indexes the strata, \( {w}_h \) are the strata weights calculated as the proportions of the study area in the strata, and the within-strata means and variances, \( {\widehat{\mu}}_h \) and \( {\widehat{\sigma}}_h^2 \), are estimated using both the SRS and the GREG estimators.
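Eqs. (8a) and (8b) can be sketched as follows. The sketch treats each \( {\widehat{\sigma}}_h^2 \) as a within-stratum variance among units (of observations for SRS, of residuals for GREG), which is an interpretation rather than something the text states. The weights, means, and variances are hypothetical.

```python
import numpy as np

def post_stratified(weights, means, variances, n):
    """Eq. (8a) and Eq. (8b): post-stratified mean and its estimated variance."""
    w = np.asarray(weights, dtype=float)
    mu_h = np.asarray(means, dtype=float)
    s2_h = np.asarray(variances, dtype=float)
    mu_str = np.sum(w * mu_h)                                      # Eq. (8a)
    var_str = np.sum(w * s2_h / n + (1.0 - w) * s2_h / n ** 2)     # Eq. (8b)
    return mu_str, var_str

# Hypothetical three-stratum example with total sample size n = 100.
mu_str, var_str = post_stratified(
    weights=[0.5, 0.3, 0.2],
    means=[10.0, 20.0, 40.0],
    variances=[4.0, 9.0, 16.0],
    n=100,
)
```

The second term in Eq. (8b) reflects the additional variability that arises because the within-strata sample sizes are random; for moderate to large n it is small relative to the first term.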

2.3.5 Estimating mean proportion forest and biomass per unit area

For the Minnesota study area, three estimates of proportion forest were calculated for each \( {t}_{grnd}\in \left[2000,2009\right] \): one using the SRS estimators, one using the GREG estimators and the 2002 map, and one using the GREG estimators and the 2007 map. For the Våler study area, three estimates of mean AGB were calculated for each of 1999 and 2010: one using the SRS estimators within strata, one using the GREG estimators and the 1999 map within strata, and one using the GREG estimators and the 2010 map within strata. Under the assumption of unbiasedness of the estimators, differences in the three estimates for the same ground data year should be small.

3 Results

For the Minnesota study area, the logistic regression models adequately represented the relationships between the probability of forest and Landsat variables (Fig. 3). If \( {\widehat{p}}_i<0.5 \) is used to predict non-forest and \( {\widehat{p}}_i\ge 0.5 \) is used to predict forest, the overall accuracies of the 2002 and 2007 classifications were 0.89 and 0.92, respectively. For the Våler study area, \( {\mathrm{R}}^{2*} \) values for the eight biomass models, one for each of the four strata for each of the 2 years, were in the range 0.72–0.96 with six of the eight greater than 0.90; these large \( {\mathrm{R}}^{2*} \) values were reflected in the strong and similar relationships between observations and model predictions for both 1999 and 2010 (e.g., Fig. 4).
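The overall accuracy calculation for the 0.5 threshold can be sketched as follows. The function name and the five sample pairs are hypothetical and do not reproduce the reported accuracies.

```python
import numpy as np

def overall_accuracy(observed, p_hat, threshold=0.5):
    """Proportion of plots whose thresholded prediction matches the observation."""
    observed = np.asarray(observed, dtype=int)
    p_hat = np.asarray(p_hat, dtype=float)
    predicted = (p_hat >= threshold).astype(int)   # p_hat >= 0.5 predicts forest
    return float(np.mean(predicted == observed))

# Hypothetical observations (1 = forest) and predicted probabilities.
observed = np.array([0, 1, 1, 1, 0])
p_hat = np.array([0.10, 0.50, 0.49, 0.90, 0.70])
accuracy = overall_accuracy(observed, p_hat)
```

Here three of the five plots are classified correctly, giving an overall accuracy of 0.6.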

Fig. 3 Accuracy of logistic regression model predictions for Minnesota study area

Fig. 4 Group means of biomass observations versus group means of predictions for stratum 1 and 1999 for Våler study area

For the Minnesota study area, annual estimates of proportion forest for each year in the 2000–2009 period were similar, regardless of the estimation approach (Table 1). The bias estimates were uniformly small, not more than 15 % of the estimates of the means, and had little effect on the GREG estimates. SE estimates were also uniformly small, not more than 4 % of the means, with the GREG estimates slightly smaller than the SRS estimates.

Table 1 Estimates of proportion forest area for Minnesota study area

For the Våler study area, the three population-level estimates of mean AGB were similar for 1999 and also for 2010 (Table 2). With the exception of stratum 2, the within-strata estimates were also similar. When the ground and map years were the same, all within-strata GREG bias estimates were less than 5 % of the estimated means, and the GREG SE estimates for the entire study area were less than 2 % of the estimated means with the latter considerably smaller than the SRS estimates. However, when the ground and map years differed, estimates of bias were considerably larger which, in turn, caused the SE estimates to be larger. Nevertheless, for the 2010 ground data and the 1999 map, the GREG SE estimate was still smaller proportionally by 0.13 than the SRS estimate, but such was not the case for the 1999 ground data and the 2010 map for which the GREG SE estimate was larger proportionally by 0.25 than the SRS estimate.

Table 2 Estimates of mean biomass per unit area (Mg/ha) for the Våler study area

4 Discussion

For the Minnesota study area, the small bias estimates associated with the GREG estimators can be attributed to the accuracy of both the 2002 and 2007 maps and the lack of change over the 2000–2009 interval. One result is the similarity in the annual estimates of proportion forest. The slightly smaller SEs for the GREG estimators than for the SRS estimators can be attributed to the combination of the utility of the auxiliary map data and the effectiveness of the GREG estimators. Despite differences in map and ground data dates of as much as 7 years, no appreciable effect on either the estimates of proportion forest area or the precision of those estimates, as indicated by SEs, was discernible.

For the Våler study area, the 1999 population estimates of mean AGB were similar as were the within-strata estimates, except for stratum 2, regardless of the estimation method; likewise, the 2010 estimates were very similar, with the exception of stratum 2, regardless of the estimation method. The reason results were different for stratum 2 is not apparent. When the ground and map years were the same, the smaller GREG estimates of SEs relative to the SRS estimates can be attributed to the utility of the combination of map auxiliary data and the GREG estimators. However, when the ground and map years differed, the greater bias and SE estimates can be attributed to AGB change that is not reflected in the outdated maps. Further, when the ground and map years differed, the utility of the auxiliary map data for increasing precision was greatly diminished, despite the accuracy of the adjusted estimates of the means.

The beneficial features of the GREG estimators are important, particularly for the Våler study area. First, as previously noted, when the ground and map years were the same, the GREG estimates of SEs were much smaller than the SRS estimates. Second, when the ground and map years differed, the bias estimates were large, but the GREG adjustment for them compensated for the fact that the outdated maps did not reflect current ground conditions. Therefore, and perhaps most importantly, the GREG estimates for the entire study area were very similar, regardless of the ground year and map year combination and despite changes in the resource between the 2 years. However, the price to be paid for the large GREG adjustments for estimated bias was much greater SE estimates; in particular, the GREG SE estimate for the combination of the 1999 ground data and the 2010 map was larger than the SRS SE estimate.

5 Conclusions

Three conclusions can be drawn from the study. First, the generalized regression estimators use the auxiliary information in the Landsat-based forest/non-forest maps and the airborne laser scanning-based biomass maps to increase the precision of estimates. This feature of the estimators was confirmed by the smaller standard errors for the generalized regression estimates of mean proportion forest and mean AGB than for the simple random sampling estimates when ground and map years were the same.

Second, the feature of the model-assisted generalized regression estimators that corrects for estimated bias makes the estimator unbiased, or at least nearly unbiased. This feature was illustrated by the similarity in estimates of mean proportion forest for the Minnesota study area and estimates of mean AGB for the Våler study area despite using maps that were outdated by 7 to 11 years. In particular, the corrections for estimated bias produced comparable estimates of population means, regardless of the temporal differences between the ground and map data and regardless of the change in the resource between the ground and map years.

Third, the price to be paid for using outdated maps is loss of precision, particularly when substantial change in the response variable occurs between the map and ground data dates. For the Minnesota study area for which change was rare, differences in dates by as much as 7 years had only negligible effects on both bias estimates and precision. However, for the Våler study area for which change was more substantial, differences in dates by 11 years had detrimental effects on precision to the extent that in one case the simple random sampling estimates were more precise than the generalized regression estimates.

Although broad generalizations based on these two study areas are ill-advised, two tentative generalizations are still possible: (1) despite relatively large temporal differences between map and ground data dates and substantial change in the response variable, the adjustment for estimated bias produced similar estimates of population means; and (2) the crucial factor affecting precision is not necessarily the temporal difference between map and ground data dates but rather the degree of change in the response variable.