Introduction

Ecosystem science increasingly relies on the use of highly derived metrics to synthesize across large datasets (Pereira and others 2013). For example, the valuation of ecosystem services requires integration of data on ecosystem function (mechanisms, fluxes and pools), land use (maps, classifications and area estimates) and economic or social estimates of the value provided by that service (Costanza and others 1997). Misrepresentation of uncertainty in derived metrics can lead to false assessment of significance and biased results. For example, Phillips and others (1998) analysed long-term plot data and reported that tropical forests were a net carbon sink; however, re-analysis by Clark (2002) showed that this result was biased by ‘artefacts’ associated with measurement of buttressed trees. It is, therefore, important for researchers to have quantitative estimates of the uncertainty associated with the derived metrics (Chave and others 2004; Yanai and others 2010; Butt and others 2013). Uncertainty arises from the inability to perfectly measure key variables, the necessary use of models to make predictions and the natural variability of ecosystem processes across the landscape (Bolker 2008). Although some element of sampling uncertainty is usually reported (that is, among-plot variability in the derived metric), other sources such as measurement error and model uncertainty are generally not incorporated (Clark and Kellner 2012; Muller-Landau and others 2013). It is essential to show the correct level of uncertainty in the derived metrics so that management implications and policy decisions can be assessed with the appropriate level of confidence. Understanding the major determinants of uncertainty can also be a powerful tool for improving methodology and the accuracy of the resulting estimates (for example, Baker and others 2004).

Plot-based estimates of forest carbon stocks and carbon fluxes are derived metrics that contain multiple sources of uncertainty (Phillips and others 1998; Chave and others 2008; Lewis and others 2009). Calculations of forest carbon stock are usually based on plot-based field measurements of stem diameter and (occasionally) stem height. These data are subject to measurement error. The imperfect measurements are transformed into stem biomass estimates, using models—introducing model uncertainty (Chave and others 2004). These include height–diameter models to predict tree height and carbon biomass models to predict carbon stock as a function of diameter, height and wood density (Coomes and others 2002; Chave and others 2005). Finally, biomass is summed across all stems in the plot and divided by the plot area to give total carbon stock estimated on a per-area basis. This step introduces a second element of measurement error relating to missing or double-counted stems and the ability to accurately measure plot area in steep and undulating terrain (Abella and others 2004; Wright 2005). Averaging across a number of plots also introduces sampling error that depends on the number of plots in the sample and the heterogeneity of the landscape (Salk and others 2013). Failure to properly account for all these sources of uncertainty is likely to result in confidence estimates that are too narrow (overoptimistic) with significant implications for carbon accounting and greenhouse gas reporting, carbon trading, and the ability to measure net changes in carbon due to management intervention (Gibbs and others 2007; Peltzer and others 2010; Holdaway and others 2012; Pelletier and others 2012).

Relatively few studies to date have quantified the measurement error or model uncertainty associated with estimates of forest biomass. In one of the more comprehensive studies, Chave and others (2004) assessed the effects of measurement error (stem diameter), model uncertainty associated with height–diameter relationships, and sampling uncertainty on estimates of tropical carbon stock in Panama. They reported that the uncertainty (standard deviation) in the aboveground biomass for individual trees averaged 47% of the estimate, with 31% arising from uncertainty in the allometric model and 16% from measurement error. At the stand level, however, the effect of measurement error was reduced to less than 1%, and the total uncertainty was reduced to 20%: 10% due to allometric uncertainty and 10% due to sampling uncertainty. In another study, Djomo and others (2011) propagated uncertainty in carbon stock estimates in tropical forest in Cameroon using the statistical propagation techniques described in Chave and others (2004), and reported that uncertainty in allometric equations contributed 30% of the total uncertainty in carbon stock estimates. These studies may have overestimated the uncertainty due to allometric models (Yanai and others 2010). Another limitation of these studies is that they have focused on tropical forests. Our study is one of the first to test these concepts in temperate forest systems.

Previous studies have tended to focus on uncertainty in carbon stock estimates, rather than uncertainty in carbon change over time. Carbon change is arguably the more important of the two metrics as it is the basis for United Nations Framework Convention on Climate Change (UNFCCC) reporting, including programs such as REDD+ (Pelletier and others 2012). We expected model uncertainty to be less important for carbon change estimates, provided that the same allometric equations are used to calculate carbon stocks at both time periods (Chave and others 2004). In contrast, measurement errors, such as in stem diameter and missing stems, are likely to be more significant for estimates of change based on repeated measures (for example, Wright 2005). Muller-Landau and others (2013) looked at the effects of measurement error (stem diameter only) and the uncertainty introduced by data-cleaning routines on the capacity to detect change in biomass carbon pools. They showed that both measurement errors and data-cleaning routines can introduce systematic errors, and that data-cleaning errors were larger. Very few studies have looked at the cumulative effects of both measurement error and model uncertainty on estimates of carbon stock and carbon change, and, therefore, the ability to assess the relative importance of the various potential sources of uncertainty is limited (Pelletier and others 2012).

Here, we develop quantitative statistical methods for propagating uncertainty in plot-based estimates of carbon stock and carbon change in temperate forests and describe the relative effects of measurement error, model uncertainty and sampling uncertainty. Using the New Zealand Land Use and Carbon Analysis System (LUCAS) natural forest plot network (MfE 2010) and associated methods (Payton and others 2004) we quantify the measurement error associated with tree-level data (stem diameter, tree height and species identification) and plot-level data (number of stems, plot area and volume of coarse woody debris) collected under normal field conditions using standard plot-based carbon monitoring methodologies. We also quantify the uncertainty associated with models used to estimate tree-level biomass, including height–diameter allometry, stem volume and wood density. We use these data to conduct a sensitivity analysis to identify the most important sources of uncertainty for estimates of stand-level carbon stock and carbon change, and illustrate the effects that failing to account for these sources of uncertainty could have on national estimates of forest carbon stock and carbon change.

Methods

Field Protocol for Carbon Estimation in New Zealand Forests

We used standard methods for measuring carbon stocks in natural forests developed for LUCAS (Coomes and others 2002; Allen and others 2003; Payton and others 2004). LUCAS monitors carbon stocks in New Zealand’s natural forest to meet New Zealand’s international reporting requirements under UNFCCC and the Kyoto Protocol and follows the Intergovernmental Panel on Climate Change (IPCC) good practice guidance (MfE 2010). The LUCAS natural forest plot network is based on 0.04-ha (20 × 20 m) plots located on an 8-km grid (with a random origin) projected across New Zealand, sampling 1,372 grid intersections where land use was classified as indigenous forest or shrubland according to the New Zealand Land Cover Database version 1 (LCDB1). Permanent carbon monitoring plots were established on 1,256 (92%) of these grid intersections during 2002–2007, and a random subset of these plots were remeasured in 2009–2010. All live stems at least 2.5-cm diameter (D) at 1.35 m were tagged, identified to species level, and D measured using a diameter tape. Diameter of standing dead stems (≥10 cm D) was also measured and these stems scored with ordinal decay class (0–4). Height was measured on a subset of live stems (and all standing dead stems and tree ferns) on each plot, using a vertex hypsometer (Haglöf, Sweden) or 8-m metal tape. Length, two orthogonal widths at each end, and decay class were recorded for all fallen coarse woody debris (CWD, defined as fallen deadwood ≥10 cm in diameter). Lengths and angles of each side of the plot were measured using a vertex hypsometer and a sighting compass to allow calculation of slope-corrected plot area. Full field methods are described in Payton and others (2004) and MfE (2012).

Field Quantification of Measurement Error

In March 2012 we measured seven existing 20 × 20 m LUCAS natural forest plots three times using independent field teams following the standard LUCAS field protocols described above. Plots were located in the central North Island of New Zealand, and were selected to encompass a broad range of temperate broadleaved forest types and stem densities (summary descriptions of plots provided in Appendix Table 1 of Supplementary Material). Each field team comprised four people and included at least one skilled botanist familiar with the local species and two people with reasonable (>5 years) field experience. Plots typically took 1 day to complete, and to represent standard field conditions and time expectations each team had a 10-day period in which to measure all seven plots. Variation among teams, therefore, reflected typical measurement error expected from experienced field teams under standard field conditions (with, for example, weather and time constraints). All field teams had the same information prior to arriving at the plot (for example, plot sheets and species lists from previous measurements) and used the same field manual. All field staff undertook additional training prior to fieldwork to standardize interpretation of the field manual. Care was taken to minimize disturbance on the plot and no communication among teams occurred during the measurement period. Individual stems for which species identification was uncertain in the field were collected and identified by independent expert botanists for each team.

Statistical Analysis of Measurement Error

We report the variance in measurements for seven different sources of measurement error, of which three were estimated at the tree level (stem diameter, tree height and species identification) and four at the plot level (plot area, number of live stems, number of standing CWD and total volume of fallen CWD). For stem diameter and tree height, we modelled the coefficient of variation (CV) among teams using a log-normal distribution (Appendix 2—Figures 1 and 2). This distribution was chosen visually based on quantile plots of the residuals versus fitted values (Appendix 2—Figures 3 and 4). We obtained an estimate of the ability of teams to correctly classify each species by calculating the proportion of overall agreement between two teams from the species contingency table (Everitt 1992), averaged over all pairwise combinations of teams (team 1–team 2, team 2–team 3, team 1–team 3), with the species-specific results shown in Appendix 2—Figure 5. This method assumes that teams’ species classification performance is independent of species, and that team pairwise comparisons are independent. We modelled all the plot-level measurement errors using a normally distributed CV.
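
The pairwise agreement calculation described above can be sketched as follows. This is a minimal illustration in Python, not the authors' R code: the function names and the five-stem example labels are hypothetical, and it assumes each team's labels are listed in the same tagged-stem order. The proportion of matching labels is the same as the diagonal total of the species contingency table divided by the number of stems.

```python
from itertools import combinations


def pairwise_agreement(labels_a, labels_b):
    """Proportion of stems given the same species label by two teams.

    Equivalent to (sum of the contingency-table diagonal) / (total stems).
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return matches / len(labels_a)


def mean_pairwise_agreement(labels_by_team):
    """Average the agreement over all pairwise combinations of teams."""
    pairs = list(combinations(sorted(labels_by_team), 2))
    total = sum(
        pairwise_agreement(labels_by_team[a], labels_by_team[b]) for a, b in pairs
    )
    return total / len(pairs)


# Hypothetical species codes for five tagged stems, one list per team.
example_labels = {
    "team1": ["NOTSOL", "NOTSOL", "WEIRAC", "DACCUP", "DACCUP"],
    "team2": ["NOTSOL", "NOTSOL", "WEIRAC", "DACCUP", "WEIRAC"],
    "team3": ["NOTSOL", "WEIRAC", "WEIRAC", "DACCUP", "DACCUP"],
}
print(mean_pairwise_agreement(example_labels))
```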

Carbon Calculations

We calculated total aboveground carbon (live stems and deadwood) following species-specific equations from Beets and others (2012). Other pools (litter, roots and soil carbon) were not included in our analysis. For live trees, we calculated stem carbon (\( C_{\text{live}} \); kgC) using an allometric function that incorporates diameter (D; cm), height (H; m) and species-specific wood density (W; kg m−3):

$$ C_{\text{live}} = 0.5 \times 0.905 \times W \times V_{\text{stem}} + {\text{C}}_{\text{branch}} + {\text{C}}_{\text{foliage}} + \varepsilon_{{{\text{C}}_{\text{live}} }} , $$
(1)

where 0.5 represents the carbon fraction of wood, 0.905 accounts for the lower wood density of the bark fraction, and \( \varepsilon_{{{\text{C}}_{\text{live}} }} \) is the model uncertainty. Stem volume (\( V_{\text{stem}} \)), branch carbon (\( {\text{C}}_{\text{branch}} \)) and foliage carbon (\( {\text{C}}_{\text{foliage}} \)) are:

$$ V_{\text{stem}} = 0.0000483\left( {D^{2} H} \right)^{0.978} $$
(2)
$$ {\text{C}}_{\text{branch}} = 0.0175D^{2.20} $$
(3)
$$ {\text{C}}_{\text{foliage}} = 0.0171D^{1.75} . $$
(4)
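
As a worked illustration, equations (1)–(4) can be written as a few small functions. This is a sketch in Python rather than the authors' implementation. It omits the error term, and it assumes that stem volume is in m³ (the text does not state the unit), so that volume multiplied by wood density in kg m−3 gives kg. The example tree (30 cm diameter, 20 m tall, wood density 500 kg m−3) is hypothetical.

```python
def stem_volume(d_cm, h_m):
    """Stem volume from diameter (cm) and height (m), equation (2).

    Assumed here to be in cubic metres.
    """
    return 0.0000483 * (d_cm ** 2 * h_m) ** 0.978


def branch_carbon(d_cm):
    """Branch carbon (kgC) from diameter (cm), equation (3)."""
    return 0.0175 * d_cm ** 2.20


def foliage_carbon(d_cm):
    """Foliage carbon (kgC) from diameter (cm), equation (4)."""
    return 0.0171 * d_cm ** 1.75


def live_tree_carbon(d_cm, h_m, wood_density):
    """Mean prediction of live-tree carbon (kgC), equation (1) without the error term."""
    stem = 0.5 * 0.905 * wood_density * stem_volume(d_cm, h_m)
    return stem + branch_carbon(d_cm) + foliage_carbon(d_cm)


# Hypothetical tree: 30 cm diameter, 20 m tall, wood density 500 kg m-3.
print(round(live_tree_carbon(30.0, 20.0, 500.0), 1))
```

Because diameter enters every term with an exponent of about 2, a small proportional error in diameter produces roughly twice that proportional error in predicted carbon.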

Equations (1)–(4) are based on pooled data from 143 harvested stems of 15 species. Measured tree heights are typically not available for 75–80% of the live stems. For these, we used species-specific allometric equations to predict tree height (H, m), based on the functional form described in Coomes and others (2012):

$$ \ln \left( {H - 1.35} \right) = \ln \left( a \right) + \ln \left( {1 - bA} \right) + \ln \left[ {1 - \exp \left( {cD^{d} } \right)} \right] + \varepsilon_{H} , $$
(5)

where D is stem diameter (cm); A is the normalized elevation (elevation (m a.s.l.)/100) of the plot scaled to be similar in range to the other predictors; and a, b, c and d are model parameters. Height models were based on a database containing over 64,000 records for 234 species, and were fitted individually for each species (Appendix 1 of Supplementary Material) using the non-linear least squares (nls) function in R (R Development Core Team 2010). Species-specific wood density values were available for 113 species (Richardson and others unpublished data). For species without wood density values, we used the corresponding genus-level average (Flores and Coomes 2011), and where that was unavailable we used the growth-form average. A separate allometric function was used to estimate tree fern biomass as a function of measured diameter and height, based on a sample of 80 stems from four species (Beets and others 2012):

$$ C_{tf} = 0.0027(D^{2} H)^{1.19} . $$
(6)

For standing dead stems (standing CWD), we estimated carbon using New Zealand tree-specific volume and taper equations (Beets and others 2012). First, stem volume (\( V_{\text{stem}} \)) of an equivalent intact live stem was estimated from diameter (D, cm) and expected total height (H, m) as:

$$ V_{\text{stem}} = 4.54 \times 10^{ - 5} \left( {D^{1.735} } \right)\left( {\frac{{H^{2} }}{H - 1.3}} \right)^{1.235} . $$
(7)

Then the volume of standing CWD (m³) was estimated based on the actual measured height (\( H_{\text{dead}} \)) of the CWD spar, using the following taper equation:

$$ V_{\text{spar}} = V_{\text{stem}} \left( {1 - 0.06501x^{2} - 2.92127x^{3} + 3.37103x^{4} - 1.35551x^{5} - 0.02924x^{81} } \right), $$
(8)

where \( x = (H - H_{\text{dead}})/H \). We adjusted the proportion of biomass remaining for standing dead stems for decay class using the general decay sequence of 82, 66 and 47% for decay classes 1, 2 and 3, respectively (Beets and others 2008). For fallen CWD, we estimated the volume (m³) of each individual piece using the formula for a truncated cone (Beets and others 2009):

$$ V_{\text{CWD}} = \frac{\pi l}{3}\left[(r_{1}^{2} + r_{2}^{2} ) + (r_{1} \times r_{2} )\right], $$
(9)

where \( r_{1} \) and \( r_{2} \) are the radii at each end of the log (m) and \( l \) is the log length (m). We calculated the biomass of each piece of fallen CWD as the product of wood volume, wood density and decay class modifier (as described above for the standing dead stems).
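
The truncated-cone calculation in equation (9) and the biomass product described above can be sketched as below. The function names and the example log are hypothetical. The sketch takes radii directly and does not attempt to reproduce how radii were derived from the two orthogonal widths recorded in the field.

```python
import math


def fallen_cwd_volume(length_m, r1_m, r2_m):
    """Volume (m^3) of a truncated cone with end radii r1 and r2, equation (9)."""
    return math.pi * length_m / 3.0 * ((r1_m ** 2 + r2_m ** 2) + r1_m * r2_m)


def fallen_cwd_biomass(volume_m3, wood_density, fraction_remaining):
    """Biomass (kg) as volume x wood density (kg m-3) x decay class modifier."""
    return volume_m3 * wood_density * fraction_remaining


# Hypothetical log: 3 m long, end radii 0.20 m and 0.15 m,
# wood density 500 kg m-3, decay class 2 (66% of biomass remaining).
volume = fallen_cwd_volume(3.0, 0.20, 0.15)
print(round(volume, 4), round(fallen_cwd_biomass(volume, 500.0, 0.66), 1))
```

Two limiting cases give a quick check on the formula: equal radii reduce it to a cylinder, and a zero radius at one end reduces it to a cone.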

Total aboveground carbon stock was the sum of the carbon contained in live trees and CWD (standing and fallen), divided by the slope-corrected area of the plot (\( A_{\text{plot}} \), m²). The horizontal area of the plot (ha) was estimated from measurements (m) of four side lengths (AD, MP, DM and PA) as:

$$ A_{\text{plot}} = \frac{{\left( {\left( {AD + MP} \right)/2} \right) \times \left( {(DM + PA)/2} \right)}}{10,000}. $$
(10)
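
Equation (10) and the per-area conversion can be illustrated with a short sketch. The function names and the example values are hypothetical, and the sketch assumes the side lengths passed in are already slope-corrected horizontal distances in metres.

```python
def plot_area_ha(ad_m, mp_m, dm_m, pa_m):
    """Horizontal plot area (ha) from four side lengths (m), equation (10)."""
    mean_side_1 = (ad_m + mp_m) / 2.0
    mean_side_2 = (dm_m + pa_m) / 2.0
    return mean_side_1 * mean_side_2 / 10_000.0


def carbon_stock_per_ha(total_carbon_kg, area_ha):
    """Convert plot-level carbon (kgC) to a per-area stock (MgC per ha)."""
    return total_carbon_kg / 1000.0 / area_ha


# A nominal 20 x 20 m plot holding a hypothetical 8,000 kgC.
area = plot_area_ha(20.0, 20.0, 20.0, 20.0)
print(area, carbon_stock_per_ha(8000.0, area))
```

Because area is the denominator, a proportional error in plot area carries through almost one-for-one to the per-hectare stock.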

Quantification of Model Uncertainty

We quantified model uncertainty for stem volume [equation (2)] using data from Beets and others (2012). Following Beets and others (2012), we fitted a generalized linear model (GLM) with a gamma error distribution and a log-link function. Diagnostic checks confirmed that this model met the basic GLM assumptions. Since the model residuals were normally distributed, the uncertainty of the model was quantified using the standard error of the mean (SEM), which reflects the standard error of an estimate of the mean of Y (\( \widehat{Y} \)) at a specified value of X (Yanai and others 2010):

$$ {\text{SEM}} = {\text{RSD}}\sqrt {\frac{1}{n} + \frac{{\left( {X - \overline{X} } \right)^{2} }}{{\mathop \sum \nolimits_{i = 1}^{n} \left( {X_{i} - \overline{X} } \right)^{2} }}} , $$
(11)

where RSD is the residual standard deviation:

$$ {\text{RSD}} = \sqrt {\frac{{\sum \left( {Y - \widehat{Y}} \right)^{2} }}{n - 2}} . $$
(12)

Equation (11) was chosen over the predictive uncertainty of the model as we were primarily interested in predicting the mean value of the population as opposed to predicting the value for individuals within a population. Predictive uncertainty for individuals is much larger (Yanai and others 2010), and may overestimate uncertainty in plot-scale measurements.
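
The distinction between the two kinds of uncertainty can be made concrete with a small sketch. The code below implements equations (11) and (12) for a simple straight-line fit. It is an illustration only: the authors fitted a gamma GLM in R, and the data here are invented. The prediction standard deviation for an individual uses the standard ordinary least squares formula, which is not given in the text and is included only for comparison.

```python
import math


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    return y_bar - slope * x_bar, slope


def residual_sd(x, y, intercept, slope):
    """Residual standard deviation, equation (12)."""
    ss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return math.sqrt(ss / (len(x) - 2))


def sem_at(x0, x, rsd):
    """Standard error of the predicted mean at x0, equation (11)."""
    n = len(x)
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return rsd * math.sqrt(1.0 / n + (x0 - x_bar) ** 2 / sxx)


def prediction_sd_at(x0, x, rsd):
    """Standard deviation for predicting one new individual at x0 (standard OLS result)."""
    n = len(x)
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return rsd * math.sqrt(1.0 + 1.0 / n + (x0 - x_bar) ** 2 / sxx)


# Invented calibration data, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.1, 1.9, 3.2, 3.8, 5.1, 6.2]
b0, b1 = fit_line(xs, ys)
rsd = residual_sd(xs, ys, b0, b1)
print(sem_at(3.5, xs, rsd), sem_at(6.0, xs, rsd), prediction_sd_at(3.5, xs, rsd))
```

The SEM is smallest at the mean of the calibration data and grows towards the edges of the sampled range, whereas the individual prediction standard deviation never falls below the RSD.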

Visual inspection of the residuals of the species-specific height–diameter relationships [equation (5)] showed that although there was more departure from normality than in the volume models, the residuals were approximately normally distributed for the majority of the species. We, therefore, quantified the uncertainty of the height–diameter models in the same way, using the SEM of the fitted models. Uncertainty in wood density estimates was modelled using species-specific estimates of within-species variability in wood density. Species-specific CVs ranged from 0.01 to 0.32. For species without species-specific wood density data (representing approximately 4% of the total biomass) the average CV (0.088) was used.

Modelling Net Effects on Carbon Stock and Stock Change

We used a Monte Carlo simulation approach to assess the net effects of the main sources of uncertainty on plot-level estimates of carbon stock and carbon stock change (Yanai and others 2012). Simulations were based on data from 227 LUCAS natural forest plots measured first in 2002–2007 and then again in 2009–2010. This subset includes only plots measured using the forest methodology and classified as natural forest according to the New Zealand Land Use Map (LUM2012 v002, sourced from New Zealand Ministry for the Environment, April 2013). Prior to analysis we conducted standard data-checking procedures on the remeasured plot data to ensure that minimum quality standards were met (Wiser and others 2001; Holdaway and others 2013). We took a conservative approach, identifying and correcting only extreme data outliers that can be traced back to clear data-entry mistakes (Muller-Landau and others 2013).

We first calculated carbon stocks and carbon stock change using the standard methods described above, including among-plot sampling error but without including any form of measurement error or model uncertainty. Monte Carlo simulations were then used to test the contribution of various sources of measurement error and model uncertainty to the overall estimates of carbon stock and carbon stock change. For each simulation, measurement errors for stem diameter and (measured) tree height were estimated for each stem by sampling from the observed distributions of measurement error. These error terms were then added to the observed values to give an estimate of the true value. Uncertainty in plot area was modelled in a similar way, by sampling from the measurement error distribution and adjusting the observed plot-level value accordingly. The number of live stems and number of standing CWD spars were estimated from the observed error distributions and the appropriate number of stems was either added to the plot data, assigning species and diameter values by randomly sampling from the existing stems within the plot, or removed from the data by random selection. We assumed that the CV for total fallen CWD volume was the same as the CV for total fallen CWD carbon stocks, and adjusted the fallen CWD carbon stocks directly by sampling from the measurement error distribution for the total fallen CWD volume. This assumption is correct if the missing and double-counted CWD volume comes from a random sample of the species and decay classes present on the plot. Species identification errors were simulated by sampling the measurement error distribution to estimate the number of stems misidentified for each plot. The corresponding number of stems was then selected at random from the plot, and species identification was changed to a random alternative species that was present in the plot and had the same growth form as the original observed value.
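
One simulation step for the tree-level and stem-count errors might look like the sketch below. Several details are assumptions, because the text does not specify them: the error for a stem is drawn as a normal deviate whose standard deviation is the sampled CV times the observed value, the stem-count error is a normal proportional error with a user-supplied standard deviation (`count_cv_sd`, hypothetical), and added stems are copies of randomly chosen existing stems. The log-scale parameters in the example are the fitted values reported for stem diameter in the Results.

```python
import random


def draw_measured_value(observed, log_mean, log_sd, rng):
    """Perturb one observed diameter or height.

    Assumption: the CV is drawn from a log-normal distribution and the error
    is normal with standard deviation CV * observed value.
    """
    cv = rng.lognormvariate(log_mean, log_sd)
    return max(0.0, rng.gauss(observed, cv * observed))


def adjust_stem_count(stems, count_cv_sd, rng):
    """Add or remove stems at random to mimic missed or double-counted stems."""
    original = list(stems)
    if not original:
        return []
    proportional_error = rng.gauss(0.0, count_cv_sd)
    target = max(0, round(len(original) * (1.0 + proportional_error)))
    simulated = list(original)
    while len(simulated) > target:
        simulated.pop(rng.randrange(len(simulated)))  # drop a random stem
    while len(simulated) < target:
        simulated.append(rng.choice(original))  # copy species and diameter
    return simulated


rng = random.Random(42)
# Hypothetical plot: (species code, diameter in cm) for each stem.
plot_stems = [("NOTSOL", 32.0), ("NOTSOL", 12.5), ("WEIRAC", 8.1), ("DACCUP", 55.0)]
resampled = adjust_stem_count(plot_stems, count_cv_sd=0.03, rng=rng)
perturbed = [(sp, draw_measured_value(d, -4.554, 0.829, rng)) for sp, d in resampled]
print(perturbed)
```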

Model uncertainty was also propagated for each simulation. Uncertainty in wood density was estimated for each species using the species-specific wood density CV. This uncertainty was applied to both live stem wood and deadwood density prior to adjusting for decay. Model errors associated with stem volume and height–diameter relationships [equations (2) and (5), respectively] were generated by sampling from a normal distribution with a mean of zero and a standard deviation equal to the SEM of the associated fitted model. For the stem volume model, a single (diameter-specific) error term was applied to all the stems in the dataset for each simulation (Yanai and others 2010). This error term was added to the predicted (mean) value to simulate uncertainty in the mean prediction of the model. For the height–diameter relationship, a single (diameter-specific) error term was added per species for each simulation to reflect the use of multiple species-specific height–diameter relationships.
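
The consequence of applying one error term to all stems, rather than an independent error to each stem, can be seen in a small simulation. The sketch below reflects one possible reading of "a single (diameter-specific) error term": one standard normal deviate is drawn per simulation and scaled by each stem's own SEM. That reading, the function name and the example numbers are assumptions, not details taken from the text.

```python
import random
import statistics


def simulate_totals(predicted, sems, n_sims, rng, shared=True):
    """Simulate the summed prediction across stems under model error.

    shared=True  : one standard normal deviate per simulation, scaled by each
                   stem's SEM (error common to all stems).
    shared=False : an independent deviate for every stem, for comparison.
    """
    totals = []
    for _ in range(n_sims):
        if shared:
            z = rng.gauss(0.0, 1.0)
            total = sum(p + z * s for p, s in zip(predicted, sems))
        else:
            total = sum(p + rng.gauss(0.0, s) for p, s in zip(predicted, sems))
        totals.append(total)
    return totals


rng = random.Random(0)
predicted = [100.0] * 50  # hypothetical mean predictions for 50 stems
sems = [1.0] * 50         # hypothetical SEM for each stem
sd_shared = statistics.stdev(simulate_totals(predicted, sems, 500, rng, shared=True))
sd_indep = statistics.stdev(simulate_totals(predicted, sems, 500, rng, shared=False))
print(sd_shared, sd_indep)
```

A shared error does not average out when stems are summed, so its contribution to the uncertainty of the total scales with the number of stems rather than with its square root.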

For each scenario, we ran a total of 1,000 simulations and calculated the mean and standard deviation of each simulation, giving a distribution of values in each case. We then used the bootstrap to calculate the median values of both the mean and the standard deviation, and the 95% bias-corrected accelerated percentiles of these distributions. This method provided estimates of uncertainty that incorporated measurement error, model uncertainty, and sampling uncertainty.
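
The summary step can be sketched with a simplified bootstrap. The code below uses plain percentile intervals instead of the bias-corrected accelerated intervals used in the analysis, so it is a simplification and not the same method. The function name and the simulated input values are hypothetical.

```python
import random
import statistics


def bootstrap_median_interval(values, n_boot, rng):
    """Median of the simulated values with a 95% percentile bootstrap interval.

    Simplification: plain percentiles, not bias-corrected accelerated ones.
    """
    n = len(values)
    boot_medians = sorted(
        statistics.median(rng.choices(values, k=n)) for _ in range(n_boot)
    )
    lower = boot_medians[int(0.025 * n_boot)]
    upper = boot_medians[min(int(0.975 * n_boot), n_boot - 1)]
    return statistics.median(values), lower, upper


# Hypothetical output of 1,000 simulations: one uncertainty estimate per run.
sim_rng = random.Random(7)
simulated_sd = [sim_rng.gauss(0.75, 0.05) for _ in range(1000)]
print(bootstrap_median_interval(simulated_sd, 500, random.Random(11)))
```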

We ran a power analysis (Bolker 2008, p. 159) for carbon change to identify the minimum effect size that was detectable across a range of sample sizes and error scenarios. This analysis used a repeated-measures design (paired t test) with a power of 0.8 and significance level of 0.05 and assumed that all the plots were measured using forest methodology described here. All statistical analyses were conducted in R version 2.11 (R Development Core Team 2010).

Results

Measurement Error

Total aboveground carbon stock across all seven repeat measured plots (Appendix Table 1) ranged from 77.6 to 503.7 MgC ha−1 (Figure 1), with the across-plot average being similar for the three teams (repeated-measures ANOVA, \( F_{2,12} \) = 0.54, P = 0.59). This result indicates that there was no evidence of detectable bias among teams. There were wide confidence intervals around the mean carbon stock estimates for all teams due to high variability in carbon stocks among plots and relatively low sample size (N = 7): 196.9 (95% confidence interval 63.0–330.8) MgC ha−1 for team 1, 200.7 (80.3–321.1) MgC ha−1 for team 2 and 206.9 (60.9–353.0) MgC ha−1 for team 3. Uncertainty in plot-level carbon stocks due to all forms of measurement error, expressed as the breadth of the 95% confidence interval around the mean, averaged 51.8 MgC ha−1 (±standard error of 18.2 MgC ha−1) across plots, and was not related to total carbon stock (linear model \( F_{1,5} \) = 1.08, P = 0.34). The breadth of the confidence interval was also independent of the total number of stems, the total CWD stock, the portion of biomass in trees >40 cm in diameter, and the mean top height of the plot (P > 0.10 in all cases); but there was marginal evidence (\( F_{1,5} \) = 5.13, P = 0.073) that uncertainties were positively correlated with the portion of biomass in trees greater than 60 cm in diameter.

Figure 1

Variability in estimates of live and deadwood (CWD) carbon stock among teams for the seven repeat measured plots. Note the lack of detectable bias among teams.

The CV in tree diameters among teams was log-normally distributed with a mean of −4.554 and standard deviation of 0.829 (Table 1; see Appendix 2—Figure 1 in Supplementary Material for further details). Diameter errors (in cm) increased in proportion to stem diameter (Figure 2A), with 95% of stem measurements being within ±5.3% of the mean diameter value, and this pattern became stronger and non-linear when errors were expressed in units of tree biomass carbon (Figure 2C), due to nonlinearities in the biomass equations [equations (1)–(3)]. Greater uncertainty was observed for tree height measurements, with the CV distributed log-normally with a mean of −3.166 and a standard deviation of 0.836 (Table 1; see Appendix 2—Figure 2). Height errors (m) increased in proportion to stem height (Figure 2B), with 95% of height measurements being within ±21.6% of the mean height value. When expressed in units of tree biomass carbon, uncertainty in tree height was strongly non-linear, more so than for stem diameter (Figure 2D), again reflecting nonlinearities in the biomass equations [equations (1)–(3)].
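
One way to relate the fitted log-normal parameters to the percentage ranges quoted above is to evaluate the 97.5% quantile of each CV distribution. The sketch below does this under the assumption that the reported mean and standard deviation are on the natural-log scale. Whether this is exactly how the ±5.3% and ±21.6% figures were derived is not stated in the text, so the correspondence should be read as indicative only.

```python
import math
from statistics import NormalDist


def lognormal_quantile(log_mean, log_sd, p):
    """Quantile p of a log-normal distribution parameterised on the log scale."""
    return math.exp(log_mean + NormalDist().inv_cdf(p) * log_sd)


diameter_cv_975 = lognormal_quantile(-4.554, 0.829, 0.975)
height_cv_975 = lognormal_quantile(-3.166, 0.836, 0.975)
print(round(100 * diameter_cv_975, 1), round(100 * height_cv_975, 1))
```

Under that assumption the diameter quantile comes out close to 5.3% and the height quantile close to 21.7%, in line with the ranges reported in the text.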

Table 1 Sources of Uncertainty in Carbon Estimates and Their Quantified Distributions

Figure 2

Uncertainty associated with stem-diameter and tree-height measurements of individual stems. A and B are in raw measurement units (cm and m, respectively) and are based on the fitted error distribution (Table 1). C and D show these same errors in units of carbon, assuming generic Nothofagus wood density and allometric relationships. Solid line has breadth of 95% confidence interval (that is, difference between 2.5 and 97.5% quantiles), dashed line is breadth of 90% confidence interval, and dotted line is one standard deviation. Error relationships become non-linear when expressed in units of carbon due to nonlinearity in the biomass equation, and this is more pronounced for tree height than for stem diameter.

Tree species were identified consistently between pairs of teams 97.8% of the time on average. The effect of species misidentification on carbon estimates depends on the size of the stem and the difference in wood density values of the misidentified species pair; with misidentifications involving large tree species with large wood density differences between species pairs having the biggest impact on carbon (Appendix 2—Figure 6). Total fallen CWD volume varied by ±39% (95% CI) because of measurement error associated with length and width estimates for fallen CWD and the total number of fallen CWD pieces per plot. The 95% CI in the number of standing CWD spars was ±27%. This CI was much higher than the 95% CI in the number of live stems, which was ±6%. Some of this variability was due to the uncertainty in assessing stem status (alive versus dead). Measurement error associated with plot area (Table 1) had a 95% CI of ±4.6%, or 18.6 m2 per 400-m2 plot.

Model Uncertainty

Model uncertainty in stem volume, tree height and wood density predictions (Table 1) had significant effects on biomass carbon estimates for individual trees (Figure 3). The uncertainty associated with wood density estimates was greater than the uncertainty due to tree height allometry and stem volume allometry combined. The confidence interval breadth associated with the predicted values increased with increasing tree biomass in all cases. For example, the breadth of the 95% quantile for tree biomass of a Nothofagus solandri individual of 10-cm diameter was 6.0% due to uncertainty in stem volume, 6.4% due to uncertainty in height–diameter allometry and 17.7% due to uncertainty in wood density. A tree of the same species with a diameter of 50 cm had 7.7, 10.7 and 27.6% uncertainty due to stem volume, tree height and wood density, respectively.

Figure 3

Predictions of individual-tree biomass for Nothofagus solandri including uncertainty in stem volume relationship, tree height allometry and wood-density estimate (Table 1). Solid line indicates mean prediction without uncertainty; shaded area represents 95% confidence interval of the predicted values.

Simulated Effects on Plot-Level Carbon Stocks and Carbon Stock Change

The aboveground carbon stock estimates (±sampling uncertainty) for the 227 selected LUCAS natural forest plots were 201.11 ± 18.23 MgC ha−1 for 2002–2007 and 194.99 ± 17.24 MgC ha−1 for 2009–2010. Uncertainty estimates calculated to include propagation of model uncertainty and measurement uncertainty in addition to sampling uncertainty were ±18.42 MgC ha−1 for 2002–2007 and ±17.46 MgC ha−1 for 2009–2010; for 2002–2007 this is only 1% (0.19 MgC ha−1, 0.09% of the mean stock estimate) larger than the value with only sampling uncertainty (Figure 4). This small increase in uncertainty in carbon stocks was attributable mainly to measurement errors, particularly those associated with missed stems, fallen CWD and plot area (Figure 4). Model error had little effect on the overall uncertainty in carbon stock estimates (Figure 4).

Figure 4

Modelled net effects of different sources of error (Table 1) on uncertainty estimates for total aboveground carbon stock (2002–2007) and annual net carbon change from a sample of 227 LUCAS natural forest plots. The horizontal axes represent the increase in uncertainty compared with the scenario of sampling uncertainty only, expressed in absolute units (top axes) and as a percentage relative to the total sampling uncertainty (bottom axes). Sampling uncertainty was ±18.23 MgC ha−1 (9.1% of the mean) for carbon stock estimates and ±0.56 MgC ha−1 y−1 (65% of the mean) for carbon change estimates. Error bars are the bootstrapped standard error of the uncertainty estimate. Note the different axes scales.

The uncertainty in carbon change estimates was more sensitive to the measurement uncertainty. The aboveground carbon change estimates (±sampling uncertainty) for the 227 selected LUCAS natural forest plots were −0.86 ± 0.56 MgC ha−1 y−1. Including both model uncertainty and measurement uncertainty increased the 95% confidence interval by 35% to ±0.75 MgC ha−1 y−1. Measurement error was the primary contributor to the overall uncertainty estimates for net carbon change, particularly missed stems, fallen CWD, plot area and tree height measurements (Figure 4). Model uncertainty had no significant effect on uncertainty in carbon change estimates.

The minimum detectable size of carbon change effect (based on a 7-year measurement interval) depended on the error scenario used, with the inclusion of measurement error and model uncertainty increasing the minimum detectable effect size by 35% (Appendix 2—Figure 7, detectable difference increased from 0.38 to 0.51 MgC ha−1 y−1 for N = 1,000). The decrease in detectable effect size obtained by increasing the number of plots sampled was consistent across all the error scenarios.
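
The minimum detectable effect can be approximated with a short calculation. The sketch below uses a normal approximation to the paired t test, not the exact procedure from the Methods. It assumes that the ±0.56 and ±0.75 MgC ha−1 y−1 values are 95% confidence interval half-widths from 227 plots, and it back-calculates the among-plot standard deviation of change from them. The function names are hypothetical.

```python
import math
from statistics import NormalDist


def sd_from_ci_halfwidth(halfwidth, n_plots, level=0.95):
    """Among-plot standard deviation implied by a CI half-width around a mean."""
    z_level = NormalDist().inv_cdf(0.5 + level / 2.0)
    return halfwidth * math.sqrt(n_plots) / z_level


def min_detectable_effect(sd_diff, n_plots, alpha=0.05, power=0.8):
    """Smallest mean change detectable by a paired test (normal approximation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1.0 - alpha / 2.0)
    z_power = z.inv_cdf(power)
    return (z_alpha + z_power) * sd_diff / math.sqrt(n_plots)


sd_sampling_only = sd_from_ci_halfwidth(0.56, 227)  # sampling uncertainty only
sd_all_sources = sd_from_ci_halfwidth(0.75, 227)    # plus measurement and model error
print(round(min_detectable_effect(sd_sampling_only, 1000), 2))
print(round(min_detectable_effect(sd_all_sources, 1000), 2))
```

Under these assumptions the two values for N = 1,000 come out near 0.38 and 0.51 MgC ha−1 y−1. In this approximation the detectable effect scales with the inverse square root of the number of plots, so every error scenario shows the same proportional gain from adding plots.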

Discussion

Quantification of Measurement Error

Measurement error is influenced by a range of factors such as tree form and forest structure, field methodology, the skill of the measurer and the field conditions under which the data were collected (Keller and others 2001; Butt and others 2013). We quantified measurement error using realistically well-trained teams under normal field conditions to ensure that our error distributions were representative of actual measurement error in the data used for national-scale carbon analysis. Relatively large measurement errors occurred at the individual tree and plot level, especially for tree height and total CWD volume. Measurement error was not explained by site, team and environmental factors (such as the slope of the plot, weather conditions and the total number of large stems). Tree-height errors were significantly larger than stem-diameter errors, as has been observed in other studies (Phillips and others 2000; Butt and others 2013). The large measurement error associated with estimates of CWD wood volume has not been previously reported, and could have significant implications for understanding CWD dynamics such as the longevity and turnover rates of deadwood (Richardson and others 2009; Fraver and others 2013). Measurement errors increased in proportion to tree size when expressed in the units used for the measurement (for example, cm diameter or m height), but when these errors were propagated through a non-linear model the errors increased non-linearly with tree size (for example, cubically when linear measures are used to model units correlated to volume, like tree carbon). Thus measurement errors are exacerbated by non-linear transformations, especially for large trees (Keller and others 2001). These results suggest that it is very hard to accurately estimate the carbon stock of individual trees or small plots, especially for old-growth forests dominated by large trees.
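
The way a proportional measurement error grows under a non-linear transformation can be sketched with first-order (delta-method) error propagation. The power-law form, the coefficient `a`, the exponent `b = 3` and the 2% relative diameter error below are hypothetical values chosen only to mirror the cubic example in the text; they are not the allometric models used in this study.

```python
def carbon(diameter_cm, a=1.0e-4, b=3.0):
    """Hypothetical power-law allometry: carbon = a * diameter ** b."""
    return a * diameter_cm ** b


def carbon_error(diameter_cm, rel_error=0.02, a=1.0e-4, b=3.0):
    """First-order SD of carbon given a proportional diameter error.

    sigma_C ~= |dC/dD| * sigma_D, with sigma_D = rel_error * D.
    """
    sigma_d = rel_error * diameter_cm
    slope = a * b * diameter_cm ** (b - 1)
    return slope * sigma_d


for d in (10, 20, 40, 80):
    print(f"D = {d:>2} cm  sigma_D = {0.02 * d:.1f} cm  "
          f"sigma_C = {carbon_error(d):.3f}  "
          f"relative = {carbon_error(d) / carbon(d):.0%}")
```

In this sketch, doubling the diameter doubles the absolute diameter error but multiplies the absolute carbon error by eight, and the relative error in carbon is `b` times the relative error in diameter.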

Our analysis of measurement error uses the average of the three independent measurements as the best estimate of the true value, because, like Djomo and others (2011) and Chave and others (2004), we consider measurement error to be random. Systematic (biased) errors are much harder to quantify, as this requires knowledge of the true value. If systematic biases in DBH or height measurements occur, however, they are unlikely to result in biased carbon stock estimates because the same measurement techniques are generally used to develop the allometric equations that convert these measurements into carbon. Systematic bias is less important for repeated measures, if the methods are consistent for the duration of the study. Systematic changes in accuracy are another source of error. For example, net carbon change may be biased upwards by an increase in observer effort during re-measurement resulting in the inclusion of previously missed stems (that is, false recruitment; Wright 2005). Further work is needed to assess the potential implications of systematic biases, especially for metrics based on repeated measurements through time.
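
As a minimal sketch of this approach, the mean of the repeat measurements can be taken as the best estimate for each tree, with the deviations from that mean treated as random measurement errors. The diameters below are invented for illustration, and the pooled within-tree standard deviation is one reasonable summary rather than the statistic necessarily used in this study.

```python
from math import sqrt
from statistics import mean

# Hypothetical repeat DBH measurements (cm): three independent teams per tree.
repeat_dbh_cm = {
    "tree_1": [32.1, 32.4, 31.9],
    "tree_2": [58.7, 59.5, 58.9],
    "tree_3": [12.3, 12.2, 12.5],
}


def measurement_errors(repeats):
    """Deviations of each repeat from the mean, used as the best estimate."""
    best_estimate = mean(repeats)
    return [value - best_estimate for value in repeats]


def pooled_within_tree_sd(measurements):
    """Pooled SD of random measurement error across trees.

    Each tree contributes (number of repeats - 1) degrees of freedom,
    because its best estimate is computed from the same repeats.
    """
    sum_sq = 0.0
    dof = 0
    for repeats in measurements.values():
        sum_sq += sum(e * e for e in measurement_errors(repeats))
        dof += len(repeats) - 1
    return sqrt(sum_sq / dof)


pooled_sd = pooled_within_tree_sd(repeat_dbh_cm)
print(f"Pooled measurement SD: {pooled_sd:.2f} cm")
```

By construction the deviations for each tree sum to zero, so this approach cannot reveal a bias shared by all teams, which is why systematic errors require knowledge of the true value.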

Uncertainty in Carbon Stock Estimates

Our study reveals that for plot-based forest carbon assessment in New Zealand, sampling error is by far the greatest source of uncertainty: including measurement error and model uncertainty results in only a 1% increase in the uncertainty associated with carbon stock estimates. In other words, uncertainty in carbon stock estimates is dominated by natural variability in carbon stocks across the landscape. This result is to be expected for national-scale surveys that encompass a range of forest types (Wiser and others 2011), and similar results have been found elsewhere. Sampling uncertainty contributed up to 98% of the total uncertainty in carbon stock estimates in the south-eastern USA (Phillips and others 2000). In contrast, in a single 50-ha plot located within relatively uniform forest in Panama, Chave and others (2004) found that sampling uncertainty contributed only 50% to the total uncertainty in carbon stocks. Sampling uncertainty may be reduced through stratification, increasing plot size or increasing the total number of plots sampled (Salk and others 2013). Phillips and others (2000) showed that on a per-hectare basis, it is more efficient to increase the number of sample plots rather than plot size, and this result is backed up by recent analyses of techniques for field-based sampling of biomass (Salk and others 2013).

Model uncertainty contributed relatively little (<0.1%) to the total uncertainty in carbon stock estimates in our study. This result contrasts with reports that model uncertainty accounted for 10–30% of the total uncertainty in carbon stocks in tropical rainforests in Panama (Chave and others 2004) and Cameroon (Djomo and others 2011). Our result may reflect greater confidence in the allometric models (for example, we had species-specific wood density data for 96% of the total biomass, and used a combination of measured tree heights and species-specific diameter–height relationships fitted to a dataset of over 44,000 trees). However, it could also reflect the method used to quantify model uncertainty. We used SEM (Yanai and others 2010) whereas Chave and others (2004) and Djomo and others (2011) used the standard deviation of the regression, which would tend to overestimate the uncertainty in the population mean (Yanai and others 2010).
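
The difference between the two measures can be illustrated with a simple linear regression on synthetic data. The sketch below is not the model-fitting code used in this study: the data, coefficients and sample size are invented, and it only shows that the standard error of the fitted mean response is much smaller than the residual standard deviation, which describes scatter among individual trees.

```python
import random
from math import sqrt

# Synthetic (hypothetical) log-diameter and log-height data.
rng = random.Random(1)
x = [rng.uniform(1.0, 4.5) for _ in range(200)]
y = [0.5 + 0.6 * xi + rng.gauss(0.0, 0.2) for xi in x]


def fit_line(x, y):
    """Ordinary least squares fit; returns quantities needed for the SEs."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    residual_sd = sqrt(sum(r * r for r in residuals) / (n - 2))
    return intercept, slope, residual_sd, x_bar, sxx


def se_mean_response(x0, n, residual_sd, x_bar, sxx):
    """Standard error of the fitted mean at x0 (uncertainty in the line)."""
    return residual_sd * sqrt(1.0 / n + (x0 - x_bar) ** 2 / sxx)


intercept, slope, residual_sd, x_bar, sxx = fit_line(x, y)
se_mean = se_mean_response(3.0, len(x), residual_sd, x_bar, sxx)

print(f"Residual SD (individual trees): {residual_sd:.3f}")
print(f"SE of mean response at x = 3.0: {se_mean:.3f}")
```

The standard error of the mean shrinks as the calibration dataset grows, whereas the residual standard deviation does not, so applying the latter to a population mean would overstate model uncertainty.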

Uncertainty in Estimates of Carbon Change

Uncertainty in estimates of net change was more sensitive, with a 35% increase in uncertainty when measurement error and model uncertainty were taken into account. This increase in uncertainty was again dominated by measurement error, with the effect of model uncertainty being cancelled out through using the same allometric models to calculate carbon stock at both time periods (Chave and others 2004; Yanai and others 2012). The strong influence of measurement error on uncertainty in carbon change estimates could reflect the relatively small plot size (0.04 ha), for which measurements from a single large tree can strongly influence plot-level net change estimates. It also could reflect the much smaller sampling uncertainty associated with net change estimates obtained using a repeated-measures design. Relatively few studies have quantified the effect of both measurement error and model uncertainty on carbon change estimates. Phillips and others (2000) found that measurement error contributed only 0.1% of the total variance in net change estimates, whereas Clark (2002) and Muller-Landau and others (2013) found that measurement error relating to buttressed trees could significantly bias the resulting net change estimates. Our results suggest that measurement error is an important contributor to total variance in estimates of net carbon change, especially when the plot size is relatively small.
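
The cancellation of model uncertainty in change estimates can be illustrated with a small Monte Carlo sketch. This is a deliberate simplification and not the error-propagation procedure used in this study: model uncertainty is reduced to a single multiplicative factor with a hypothetical 5% coefficient of variation, and the two stock values are the reported means used purely as illustrative inputs.

```python
import random
from statistics import stdev

rng = random.Random(42)

stock_t1 = 201.11     # MgC/ha at first measurement (illustrative input)
stock_t2 = 194.99     # MgC/ha at second measurement (illustrative input)
model_cv = 0.05       # hypothetical relative uncertainty in the model factor
n_draws = 10_000

change_shared = []        # same model draw applied at both times
change_independent = []   # a separate model draw for each time
for _ in range(n_draws):
    factor = rng.gauss(1.0, model_cv)
    change_shared.append(factor * stock_t2 - factor * stock_t1)

    factor_t1 = rng.gauss(1.0, model_cv)
    factor_t2 = rng.gauss(1.0, model_cv)
    change_independent.append(factor_t2 * stock_t2 - factor_t1 * stock_t1)

sd_shared = stdev(change_shared)
sd_independent = stdev(change_independent)

print(f"SD of change, shared model draw:      {sd_shared:.2f} MgC/ha")
print(f"SD of change, independent model draws: {sd_independent:.2f} MgC/ha")
```

When the same model draw is applied at both times, the model error scales only the small difference between the two stocks; with independent draws it scales each stock separately and the errors add. Real allometric models have several parameters and stems change size between measurements, so the cancellation in practice is likely to be partial rather than complete.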

Programs such as REDD+ are designed to incentivize management of forests for increased carbon sequestration. Such programs, therefore, depend on the ability to link management activities to increases in carbon sequestration rates (Holdaway and others 2012; Pelletier and others 2012). Establishing this link in a statistically robust manner requires full quantification of the uncertainty associated with carbon change estimates. In our case, inclusion of measurement error and model uncertainty increased the uncertainty in carbon change by 35%. To counteract these wider confidence intervals, programs such as REDD+ need to target situations where large carbon gains are likely (that is, large effect size), or increase their monitoring intensity to enhance statistical power to detect changes.

Caveats

Some potentially important sources of uncertainty were not assessed in this study. These include: uncertainties in wood decay classes and decay curves, which are likely to influence deadwood carbon estimates (Fraver and others 2013; Mason and others 2013); uncertainty in applying wood-density estimates obtained from intact live stems to a real-world sample that contains hollow or decaying stems (Clark and Kellner 2012); uncertainty in the belowground fraction when estimating total carbon stock (Mokany and others 2006); uncertainty in the carbon concentration of wood (Chave and others 2009); and uncertainty in remote sensing techniques used to estimate total forest area (Gibbs and others 2007; Foody 2010). We also did not quantify uncertainty associated with model selection, which is an important additional source of uncertainty for carbon stock estimates (Djomo and others 2011).

Our power analysis applies to the aboveground carbon pool for a national-scale sample of only 227 0.04-ha plots measured using standard forest methodology. New Zealand’s current natural forest carbon monitoring plot network (LUCAS) contains a total of 1,256 plots, approximately 900 of which are measured using the forest methodology described here. The remaining plots are located in shrubland and are measured using shrubland-specific techniques (Payton and others 2004). Very little work has been done to assess uncertainty in shrubland methods but experience has shown that these are much harder to implement in the field, and the allometric relationships for predicting carbon from shrubland are significantly less developed than are those for the forest environment (Coomes and others 2002). Further work is, therefore, required to quantify uncertainty associated with shrub plots and its contribution to estimated national carbon stock and stock change.

Practical Recommendations

Our results identify the key components of uncertainty in forest carbon estimates, and this information can be used to assist model development and allocation of effort in the field. Sampling uncertainty could be reduced by increasing the number of plots sampled or increasing plot size (Phillips and others 2000; Salk and others 2013). Model uncertainty could be reduced through increasing the numbers of individuals used to construct volume, tree height and wood density models (Chave and others 2005). Of all the sources of model uncertainty, wood density has the greatest relative uncertainty (Figure 3) and, therefore, should be prioritized. Wood density models could be improved either by increasing the number of species with species-specific wood densities, or by reducing uncertainty for species that already have species-specific estimates by sampling more individuals (Flores and Coomes 2011). Measurement error could be reduced by focusing efforts in the field on measurements that have the greatest influence on total uncertainty, in particular missed or double-counted stems (Muller-Landau and others 2013) and measurements of CWD volume. Staff allocation to key tasks should also be randomized to avoid measurement bias due to differences in interpretation and implementation of methods. Measurement error can never be eliminated, and in practice it is a matter of balancing the increase in data accuracy achieved through improved sampling strategies and a larger sample with the inevitable increase in resources (costs) required to achieve this increase (Butt and others 2013). Data-cleaning procedures can be used to correct for measurement errors at the analysis stage, but this approach may introduce even more bias and uncertainty and should be used with caution (Muller-Landau and others 2013). We recommend that measurement errors be accepted as unavoidable, and, therefore, be quantified and explicitly incorporated into any analysis.

This study demonstrates that robust plot-based estimates of national carbon stock and carbon change can be obtained through inclusion of quantified estimates of sampling uncertainty, measurement error and model uncertainty, providing confidence and support for the use of plot-based carbon estimates for management and policy decision making.