Background

Simulation studies have long been used to investigate forest sampling (O’Regan and Palley 1965), as simulation is the only viable technique for testing the validity of a sampling design over a large number of samples. For example, a recent study to evaluate empirically the efficacy of different estimators in complex designs for large-scale biomass inventories (Ene et al. 2013) showed large differences between the actual and standard analytical variances. It also highlighted other methods that produced satisfactory statements of the precision of the estimates.

Airborne laser scanning (ALS) is increasingly becoming more accessible as a standard forest management tool (Næsset et al. 2004, Watt et al. 2013). Adequate spatial coverage has long been considered a critical feature of good forest inventory design. Auxiliary information from remote sensing, especially ALS, is more frequently available at the design stage so it is possible to construct a sample that is balanced both spatially and across the auxiliary variables. The benefits of this approach have been demonstrated by simulation (Grafström and Hedström 2013).

In stand or woodlot inventory, simple designs can be used to provide adequate information provided they are practical and cost efficient. While the inventory planner aims to ensure systematic spatial coverage when locating samples in a forest stand (Goulding and Lawrence 1992, Gordon 2005), there is little empirical information as to the importance of coverage, or guidance on the best way to achieve it. This simulation study was designed to examine the consequences of different spatial methods of sampling for mean stem volume when applied to even-aged plantation stands (in the 1 – 30 ha size range). The number of stems in a stand can be determined by a tally in small stands, or via stem-counting methods where remote imagery or ALS data are available (Culvenor 2002, Pont et al. 2015). The combination of mean stem volume and number of stems provides stand-level volume estimates. Two measures of a sampling method’s success were used for comparison: bias, the difference between the population mean and the average sample mean over a large number of samples, and reliability, the percentage of samples which contained the population mean within the calculated confidence interval.

Methods

Data

Data sets from five, even-aged stands of radiata pine (Pinus radiata D.Don) in New Zealand were assembled. Details of each stand are provided in Table 1. Stand 1 was approximately 10 times larger than each of the other four stands. Stand 3 was approximately half the age of the other four stands. These sets were census lists of every stem in a forest stand, including each stem’s location, breast-height diameter over-bark (DBH) and height. In cases where height had not been measured on every stem, a Petterson height-DBH equation (Schmidt, 1967) was fitted and the stem assigned a conditional, average height. The volume under-bark of each stem was derived from the DBH and height using a volume function (Kimberley and Beets 2007).

Table 1 Details of Data Sets

In the four small stands (Stand 2, 3, 4, 5), stems had been geo-located using ground-based, surveying methods.

The data for the largest stand (Stand 1) was obtained from an ALS set supported by 371 stems measured in ground plots. The ALS point cloud was processed to create a canopy–height model (CHM), represented as a grayscale image. Stand density estimated from the ground plots was used to define a suitable range of smoothing levels to be applied to the CHM image. A smoothed image was generated that corresponded to each of ten smoothing levels covering the range. A tree-detection algorithm was used to process each image so that the location (Figure 1), crown radius and height of every stem could be retrieved.

Figure 1
figure 1

Stand 1 shown as a Quasi-random group (Sobol) sample. The stand boundary is shown as a blue line, stem locations as green dots, selected stems as red crosses, plot centres as blue plus signs. The underlying map grid is labelled in metres.

Under the assumption that the crown radius and stem DBH are linearly related (Patton 1988, Madgwick 1994), the standardised distributions of DBH from the ground plots were compared with the standardised distributions of crown radius from the ten segmented CHM images. The Kolmogorov-Smirnof (KS) D statistic (Conover 1999) was used to select the best-matching distribution (corresponding to a smoothing level). A simple linear relationship was then used to estimate DBH for every stem from its standardised crown radius. The resulting distribution of DBH was strongly correlated with the actual distribution of DBH from the 371 measured stems (Pont et al. 2015).

Sampling methods

Computer programmes in C# were written to simulate the following sampling methods in such a way that they could be repeatedly applied to the list of stems from each stand:

Random sample

A pseudo-random number generator was used to select stems from the population with equal probability and without regard for their spatial location. A random-sampling method was included in this study as a benchmark because its characteristics are well known and conform to the requirements of sampling theory even though such methods often fail to provide good spatial coverage.

Stand-boundary sample

This approach was used to simulate a spatially biased sample by selecting stems based on their proximity to random points around the boundary of the stand. This method was included in the comparison to quantify any size difference between edge stems and interior stems as this would result in a biased estimate of stem volume.

Quasi-random group sample

A quasi-random sequence is less random than a pseudo-random sequence but is still useful for tasks such as numerical integration and optimisation because it tends to sample n-dimensional space “more uniformly” than random numbers. Quasi-random sampling also allows for additional sample points to be added, with continual improvement in accuracy if, say, a target precision has not been met after measuring an initial sample. It provides good spatial coverage with “random” locations, without the risk of the locations coinciding with some periodic variation in the population, a risk which grid-based sampling always carries.

For each sample, a set of Sobol (Sobol 1967) points in two dimensions was found that was within the stand, or within a certain radius of the stand boundary. All stems that were within this radius of each point were selected. Radii were used to correspond to circular plots of 0.01 hectares. An example of this type of sample is shown in Figure 1.

Zigzag-line transects

Zigzag-line transects (also known as Z-plots) can be arranged with equal angles, but this introduces a coverage bias. The Zigzag-line approach used here was an equal-spaced sampler as this has satisfactory coverage properties (Strindberg and Buckland 2004). Each transect was 2 m wide and each sample contained all the stems that fell within all transects. Z-plots are an effective way of traversing a small stand or woodlot in a single pass, while ensuring good coverage in a repeatable procedure.

Grid-based plots

A grid (with random origin and orientation) that extended beyond the stand boundary was overlaid onto the stand. All stems within a certain radius of each grid point within the stand were selected along with stems within a certain radius of grid points at a set distance from the stand boundary. Radii were used that corresponded to circular areas (plots) of 0.01, 0.02, 0.04 and 0.08 hectares, which provided four separate samples. Each selected stem was labelled with its grid point reference to facilitate cluster analysis. Grids are commonly used when planning plantation forest inventory as they usually result in good coverage and are simple to navigate.

Single-point sample

A random point was selected within the stand and the closest stems (for the required sample size) were selected. This would be a valid procedure if it is assumed that there is no effect of location/stand edge on stem volume.

Several existing methods for adjusting for stand-edge sampling have been shown to provide improved estimates (West, 2013), but incorporating such methods into the simulations outlined above would increase their complexity. Instead, a simpler method was applied whereby sample plots whose centres lay outside the stand were included in any sample only if they intersected the stand boundary (Flewelling and Iles 2004, Schmid-Haas 1982).

All nine sampling methods (including the four different plot areas arranged on grids) were applied to each stand. Tests using different start points for the same stand and/or sampling method showed that a thousand samples would produce an average mean stem volume with a coefficient of variation around 0.1%. This amount of variation was considered small enough not to mask any practical differences among methods. A thousand samples were drawn from random start points for each combination of stand and sampling method.

For the four small stands (2, 3, 4, 5), an average of 70 stems was selected in each sample while an average of 200 stems was selected in the large stand (Stand 1). Having similar sample size by method within stand affords easier comparison between methods in terms of measurement effort. Sample stems were selected without replacement.

For each sample, the mean and confidence interval of stem volume were calculated by treating the selected stems as a simple random sample (SRS). That is, the standard error was calculated as:

$$ {s}_{\overline{x}}=\sqrt{\frac{s^2}{n}\left(1-\frac{n}{N}\right)} $$
(1)

where s 2 is the sample variance, n is the number of stems sampled and N is the number of stems in the stand.

Confidence intervals for each sample were formed as:

$$ \overline{x}\pm {t}_{0.975,n-1}{s}_{\overline{x}} $$
(2)

where \( \overline{x} \) is the estimated mean stem volume and t is a Student’s t-test value.

Those samples selected using plot-based methods (quasi-random groups and grid-based) were also analysed as cluster samples using a ratio estimator. Cluster sampling saves costs by measuring a number of stems at each sample point and is more robust in the presence of spatial correlation between stems. However it is less efficient than random sampling and will produce estimates with poorer precision for the same number of measured stems.

Under cluster sampling using a ratio estimator, the mean stem volume is estimated as:

$$ \frac{{\displaystyle {\sum}_{i=1}^n{y}_i}}{{\displaystyle {\sum}_{i=1}^n{M}_i}} $$
(3)

where y i is the cluster total volume and M i is the stem count of the ith cluster. This is identical to the SRS estimate. The variance of the estimate is:

$$ \frac{N\left(N-n\right)}{M^2n\left(n-1\right)}{\displaystyle \sum_{i=1}^n{\left({y}_i-\overline{x}{M}_i\right)}^2} $$
(4)

where N and n are the number of clusters in the population and sample respectively and M is the number of stems in the population. This is a function of the variation only among cluster totals and so will be minimised if clusters are similar. A confidence interval was constructed around \( \overline{x} \).

Analysis of the samples from each combination of stand and sampling method provided the following statistics:

Bias

The bias is the mean of the thousand sample estimates of mean stem volume, minus the population mean, as a percentage of the mean of the estimates. The significance of the calculated Student’s t-test statistic (which tests H 0 , i.e. the mean of the sample estimates equals the population mean) was also calculated.

Design effect

To give an indication of the efficiency of a sampling method, the ratio of the variance of the estimate to the variance of the estimate from the random sample was formed. Design effects less than one indicate improvements in efficiency that may be due to the spatial coverage of the sample.

Reliability-SRS

This is the percentage of samples that contained the population mean in their individually calculated SRS 95% confidence interval. For one thousand samples, this percentage is expected to lie in the interval from 93.6% to 96.4%.

Reliability-cluster

The same approach was used here as for the SRS Reliability but using the 95% confidence interval from the cluster analysis.

Results and discussion

Results from the simulations for all stands are shown in Table 2. Sampling the stand boundary produced a biased estimate for all five stands tested. The average overestimate for the four older stands (1, 2, 4 & 5) was 12% (range 11.54 – 13.52). Results for the younger stand (Stand 3) showed less bias (2.83%), which is possibly due to recent thinning of this stand that minimised the difference between the boundary and interior stems. The magnitude of this difference is likely to depend on the proximity and age class of adjacent stands as well as the time the edge stems have had to respond to their growing conditions.

Table 2 Simulation Results 1 by Stand

The remaining sampling methods showed little bias. Most estimates were within ±0.5% of the population mean (Figure 2), although the Single-point sampling method was clearly less accurate than the others. The Zigzag method produced a significantly biased estimate for four of the five stands. This may be due to a subtle under-sampling of the stand edge, although the bias in all stands was within ± 1%.

Figure 2
figure 2

Bias in the estimate of stem volume for eight of the sampling methods studied. Bias in the average sample estimate of mean stem volume, as a percentage of the average estimate, by stand and sampling method. Results from the Stand-boundary sampling method are not shown as the bias was an order of magnitude larger than for most other methods.

Design effect indicates the efficiency of the design relative to a SRS. There appears to be no consistent improvement in efficiency by explicitly including spatial coverage in the sampling design (Table 2). Quasi-random, Zigzag, and Grid 0.01, 0.02, and 0.04 ha groups all showed a range of design effects centred near a value of 1. However, there is indication of a loss in efficiency as plot size exceeds 0.04 ha (Table 2). In the stands used in this study, sample designs with plots larger than 0.04 ha are likely to require an increased sample size to compensate for the drop in efficiency if the same level of precision is required.

Under SRS, grid plots greater than 0.02 ha generated less reliable estimates of the confidence intervals than smaller plots (Figure 3). The most reliable, spatially dispersed method was that for grid-based 0.02 ha plots. This result is similar to the results of Morrison et al. (2008) who found “grid-based systematic designs were more efficient and practically implemented” than other methods that were compared in a simulation study of sampling rare populations. Theoretical studies (Comas et al. 2011) have also pointed to the worth of using a sampling grid for clustered populations.

Figure 3
figure 3

Reliability (SRS) for eight of the sampling methods studied. The percentage of SRS confidence intervals that included the population mean, by stand and sampling method. For 1000 samples, this percentage is expected to lie in the interval from 93.6% to 96.4%. Results from the Stand-boundary sampling method are not shown as the reliability was less than 14% in all but Stand 3.

The reliability of the confidence intervals of the clustered samples calculated using the ratio estimator was poor (Table 2). All these confidence intervals were overly optimistic with most methods showing reliability values of less than 93%.

Conclusions

As expected, sampling the stand boundary produced an over-estimate of stem volume. Stand-edge stems were about 12% larger than interior stems in stands 1, 2, 4 and 5 (aged 22 to 26 years old).

Most sampling methods produced little bias and good reliability when data were analysed in the form of simple random samples. This result provides the inventory forester with some scope to design sampling procedures that will be practical and operationally efficient while avoiding bias and still producing reliable confidence intervals. Plots 0.02 - 0.04 ha in size, located systematically on a grid with random orientation and origin, gave some of the most unbiased and reliable estimates. However, the Zigzag sampling method may be appropriate for small stands as it shows little bias, good reliability and is likely to be operationally efficient.

Reliable estimates of mean stem volume were obtained using simple random samples even though this approach disregards the clustered nature of plot-based samples. It is possible that the low stand density minimised any spatial correlation among stems in the stands studied, so it would be informative to test this explicitly and repeat the simulations in stands covering a range of stand densities.