Introduction

Multipurpose resource inventories have to fulfil several demands (Lund 1998), and their methods are usually evaluated regarding efficiency, which means that a required precision should be achieved with a minimum of inventory costs or that the maximum precision should be achieved with predefined inventory costs. Therefore, different sampling procedures have been developed over the last decades with the aim of cost reduction in mind. An established approach is to use auxiliary variables, the inventory of which is cheaper than that of the target variables.

One such method is double sampling for stratification (2st). This is a well-known, widely used and efficient method (Cochran 1977; de Vries 1986; Schreuder et al. 1993; Köhl 1994; Särndal et al. 2003; Gregoire and Valentine 2008; Mandallaz 2008), which has recently been studied under the infinite population approach (Saborowski et al. 2010). Scott and Köhl (1994) extended 2st by sampling with partial replacement (SPR). In the first phase of 2st, all sampling units are stratified according to specific rules with the help of qualitative variables. Often this is done based on aerial images, which serve as a source of auxiliary variables. After the stratification, within-strata subsamples of the first-phase units are inventoried; in forest inventories, it is common to do this with terrestrial sampling. Even though the costs of this sampling procedure are relatively low in comparison with other methods (Brassel and Köhl 2001; Saborowski et al. 2010), a further cost reduction is desirable.

A special opportunity to do so occurs when data from a previous inventory exist as is the case with periodic inventories. Saborowski et al. (2010) showed how 2st-sampling can be applied in periodic inventories with optimised allocation of second-phase units. In periodic inventories, one may be willing to accept a slight loss of precision regularly on every second occasion, or at least temporarily on one occasion in times of small budgets, if that is accompanied by a remarkable cost reduction. Such “intermediate” low-cost inventories are known, for example, from forest disease inventories in Germany, where the regular square grid of 4 km × 4 km was reduced to 8 km × 8 km for intermediate occasions until 2005, when the 8 km × 8 km grid became the regular grid.

Under a simple one-phase design for the periodic inventories, one might use double sampling for regression using the plot measurements from the previous inventory as an auxiliary variable (regressor) to compensate for the reduced sample size of the current inventory. Here, we want to deal with the generally more efficient 2st-design, which could be replaced temporarily, or in a fixed cycle on every second occasion, by a new three-phase design. The proposed design combines first-phase stratification as applied in the 2st-design and double sampling for regression (2lr) (Cochran 1977; Särndal et al. 2003; Mandallaz 2008) based on the finite number of second-phase plots within strata.

Moreover, we use not only the most recent preceding plot measurements as auxiliary variable, but also their updates predicted by a growth model that considers the current silvicultural policy, at least to a certain extent, and we compare the efficiency of both approaches.

The three-phase design is expected to account for different within-strata variances of the target variable, which will occur particularly for volume or basal area if age classes or species groups are used as strata, as well as for regression models that vary among strata (Fig. 1). Thus, an integration of 2st and 2lr in a three-phase design seems promising, because it combines the strengths of both sampling schemes. The stratification helps to create more homogeneous subpopulations, whereas the regression includes additional information at low cost based on the preceding inventory.

Fig. 1 Three samples of size 15, showing different relationships between x and y. The overall relationship misrepresents these different relationships
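The point made by Fig. 1 can be illustrated with a small sketch. The data below are made up for illustration only (they are not the data behind Fig. 1), and the function name `ols_slope` is hypothetical; the slope is the ordinary least-squares coefficient.

```python
def ols_slope(x, y):
    """Ordinary least-squares slope of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


# Three hypothetical samples of size 15 with different x-y relationships.
x_a = [float(i) for i in range(15)]
y_a = [10.0 + 2.0 * x for x in x_a]          # slope 2
x_b = [20.0 + i for i in range(15)]
y_b = [80.0 - x for x in x_b]                # slope -1
x_c = [40.0 + i for i in range(15)]
y_c = [0.5 * x - 15.0 for x in x_c]          # slope 0.5

slopes_within = [ols_slope(x_a, y_a), ols_slope(x_b, y_b), ols_slope(x_c, y_c)]

# Pooling all 45 points yields a single slope that matches none of the samples.
slope_pooled = ols_slope(x_a + x_b + x_c, y_a + y_b + y_c)
```

In this constructed example the pooled slope is negative although two of the three samples have positive slopes, which is the kind of mismatch that motivates fitting the regression separately within strata.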

A combination of current sample plot measurements and model-based updates of previous inventories was also suggested by van Deusen (1996) in a rotating panel context. The difference from our setting is that he had to deal with auxiliary data from a time series of previous inventories, where the target variable currently measured on a subsample of all plots has to be predicted based on data that were measured the furthest in the past. Sampling with partial replacement (Gregoire 2005) is related to our approach, insofar as we choose a subsample to estimate the regression coefficients and omit the rest of the sampling units from the most recent occasion. However, the omitted units are not replaced here by new ones, as would be done with SPR, because we use subsampling as a measure for cost reduction.

Forest growth models have developed rapidly in recent years (Pretzsch and Ďurský 2001; Pretzsch 2002, 2009; Schmid et al. 2006; Albrecht et al. 2009; Härkönen et al. 2010; Vospernik et al. 2010), and their forecasts have become more and more reliable. Therefore, it should be possible to use the results of these growth simulations in forest inventories. In a previous study (von Lüpke et al. 2011), 2st and growth model–based updates were combined in a composite estimator after Schaible (1978). The mean squared error (MSE) of this estimator, as a measure of precision, is calculated using the estimated bias of the simulation results. Because this bias was considerably high, this approach could not reduce the number of sample points substantially. A regression estimator seems to be the more promising approach because it uses the correlations between previous and current inventories, which are expected to be high.

In the following article, we present results that have been obtained for the three-phase estimator that combines 2lr with 2st. In the case study, aerial images were used as auxiliary variable to identify strata and (updated) data from the previous inventory as volume predictors in a regression model.

A three-phase estimator for stratification and regression

Because the estimator assumes the infinite population approach in the first phase, a short explanation of the approach seems appropriate. Whereas the finite population approach assumes that the study area consists of a finite number of non-overlapping sampling units, the infinite population approach assumes point sampling in a given area. The local value of the target variable at a sample point is defined by the tree data within a sample plot assigned to the point. An obvious disadvantage of the finite population approach is that not all shapes of sampling units fulfil its assumptions. With circular plots, for example, it is impossible to cover the whole study area without overlaps. Therefore, the infinite population approach is more realistic and preferable for forest inventory; a comprehensive theory with applications can be found in Mandallaz (2008).

For all the schemes presented here, simple random sampling (SRS) is assumed in the first phase. In practice, often only the first sample point is chosen randomly, and from that starting point a systematic grid is constructed to find the rest. Generally, unbiased variance estimators do not exist for systematic sampling; therefore, the SRS estimators are often applied instead. This can be justified by the fact that they lead to an overestimation in most cases and thus are assumed to be conservative estimators (Gregoire and Valentine 2008; Mandallaz 2008).

Double sampling for stratification

Two phases can be distinguished in this sampling scheme. After stratification of the first-phase sample plots \((n^{\prime}),\) measurements only take place in a sub-sample (n). To estimate the mean of the target variable (e.g. dbh, basal area or volume), the strata means \((\overline{y}_h)\) are weighted with the proportions of first-phase sample points per stratum \((n_h^{\prime}/n^{\prime}=w_h),\) as can be seen in Eq. 1 (see e.g. Cochran 1977).

$$ \widehat{\overline{Y}}_{2st}=\sum^L_{h=1}w_h\frac{1}{n_h} \sum_{i=1}^{n_h}y_{hi}=\sum^L_{h=1}w_h\overline{y}_h $$
(1)

Eq. 2 shows an unbiased estimator for the variance of this sampling procedure under the infinite population approach (Saborowski et al. 2010), where \(s_{h}^{{2}}\) is the estimator for the within-stratum variance of the target variable (Eq. 3) and \(\nu_h=n_h/n_h^{\prime}\) the proportion of terrestrial sample points per stratum.

$$ \widehat{V}\left(\widehat{\overline{Y}}_{2st}\right)=\frac{1}{n^{\prime}-1} \left(\sum^L_{h=1}\frac{n_h^{\prime}-1}{n^{\prime}}\frac{s_h^2}{\nu_h}+ \sum^L_{h=1}w_h\left(\overline{y}_h-\widehat{\overline{Y}}_{2st}\right)^2\right) $$
(2)
$$ s_h^2=\frac{1}{n_h-1}\sum_{i=1}^{n_h}\left(y_{hi}-\overline{y}_h\right)^2 $$
(3)
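As a minimal sketch of Eqs. 1–3 (not the implementation used in this study), the 2st estimator and its variance estimator could be coded as follows. The function and argument names are hypothetical; the input is the number of first-phase points per stratum and the second-phase measurements per stratum.

```python
def two_phase_stratified(n_prime_h, y_by_stratum):
    """2st mean (Eq. 1) and variance estimator (Eq. 2).

    n_prime_h    -- first-phase sample sizes n'_h, one per stratum
    y_by_stratum -- second-phase measurements y_hi, one list per stratum
    """
    n_prime = sum(n_prime_h)
    weights, means, variances, fractions = [], [], [], []
    for n_ph, y in zip(n_prime_h, y_by_stratum):
        n_h = len(y)
        y_bar = sum(y) / n_h
        weights.append(n_ph / n_prime)                                # w_h
        means.append(y_bar)                                           # stratum mean
        variances.append(sum((v - y_bar) ** 2 for v in y) / (n_h - 1))  # Eq. 3
        fractions.append(n_h / n_ph)                                  # nu_h

    y_hat = sum(w * m for w, m in zip(weights, means))                # Eq. 1

    within = sum((n_ph - 1) / n_prime * s2 / nu
                 for n_ph, s2, nu in zip(n_prime_h, variances, fractions))
    between = sum(w * (m - y_hat) ** 2 for w, m in zip(weights, means))
    var_hat = (within + between) / (n_prime - 1)                      # Eq. 2
    return y_hat, var_hat


# Hypothetical toy data: two strata with 60 and 40 first-phase points.
y_hat_2st, var_hat_2st = two_phase_stratified([60, 40],
                                              [[10.0, 12.0, 14.0],
                                               [20.0, 24.0, 28.0, 32.0]])
```

Each stratum needs at least two second-phase measurements, otherwise Eq. 3 is undefined.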

Double sampling for regression

In this sampling procedure, which we will later use according to the finite population approach given the \(n_h^{\prime}\) first-phase samples within strata, the auxiliary variable (x) is sampled at all first-phase plots \((n^{\prime}).\) Again, the target variable (y) is only measured in a sub-sample (n). For the estimation of the mean of this target variable (Eq. 4), the sample means of the auxiliary variable, calculated from the sample points of phases one \((\overline{x}^{\prime})\) and two \((\overline{x}),\) are required. Besides, the sample mean of the target variable \((\overline{y})\) and the estimated regression coefficient b (Eq. 5) are used (Cochran 1977).

$$ \widehat{\overline{Y}}_{2lr}=\overline{y}+b\left(\overline{x}^{\prime} -\overline{x}\right) $$
(4)
$$ b=\frac{\sum_{i=1}^n\left(y_i-\overline{y}\right)\left(x_i-\overline{x}\right)} {\sum_{i=1}^n\left(x_i-\overline{x}\right)^2} $$
(5)

An estimator for the variance (Eq. 6) is given in Cochran (1977), formula (12.67), where \(s_y^2\) is the variance estimator of the target variable and \(s_{y.x}^2\) is an unbiased estimator of \(S^2(1-R^2)\), with \(S^2\) being the true variance of y and R the correlation coefficient between x and y. Here, N stands for the total number of all possible sampling units in the study area. Since we will use 2lr in our three-phase estimator conditionally on the first-phase sample within each of the strata, the finite population approach is appropriate with N replaced by \(n_h^{\prime},\) \(n^{\prime}\) by \(n_h\) and n by \(n_{h}^{*}\) (see Eq. 7 and Appendix A.4).

$$ \widehat{V}\left(\widehat{\overline{Y}}_{2lr}\right)=\frac{s_{y.x}^2}{n} +\frac{s_y^2-s_{y.x}^2}{n^{\prime}}-\frac{s_y^2}{N} $$
(6)
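Eqs. 4–6 can be sketched in code as follows. This is an illustration under stated assumptions, not the study's implementation: the names are hypothetical, \(s_{y.x}^2\) is taken to be the residual mean square with n − 2 degrees of freedom, and the term \(s_y^2/N\) is simply dropped when no N is supplied.

```python
def double_sampling_regression(x_first_phase, pairs, N=None):
    """2lr mean (Eq. 4), slope (Eq. 5) and variance estimator (Eq. 6).

    x_first_phase -- auxiliary variable x at all n' first-phase units
    pairs         -- (x, y) at the n subsampled units
    N             -- number of units in the population (assumption: the
                     finite-population term is omitted if N is None)
    """
    n_prime, n = len(x_first_phase), len(pairs)
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    x_bar_prime = sum(x_first_phase) / n_prime
    x_bar, y_bar = sum(xs) / n, sum(ys) / n

    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in pairs)
    syy = sum((y - y_bar) ** 2 for y in ys)

    b = sxy / sxx                                   # Eq. 5
    y_hat = y_bar + b * (x_bar_prime - x_bar)       # Eq. 4

    s2_y = syy / (n - 1)
    s2_yx = (syy - b * sxy) / (n - 2)               # residual mean square
    var_hat = s2_yx / n + (s2_y - s2_yx) / n_prime  # Eq. 6 without the N term
    if N is not None:
        var_hat -= s2_y / N
    return y_hat, b, var_hat


# Hypothetical toy data: eight first-phase units, four of them measured for y.
y_hat_2lr, b_2lr, var_hat_2lr = double_sampling_regression(
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    [(1.0, 2.0), (3.0, 6.5), (5.0, 9.5), (7.0, 14.0)])
```

The subsample must contain at least three units so that the residual mean square is defined.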

Three-phase sampling for stratification and regression

The estimator used in this study was suggested by Saborowski (1994), who presented it together with a variance estimator under the finite population approach. In total, three phases can be distinguished in this procedure (Fig. 2). In the first phase, all sampling units \((n^{\prime})\) are stratified into L strata \((n^{\prime}=\sum\nolimits_{h=1}^L n_h^{\prime}),\) and in the second-phase measurements of an auxiliary variable x are collected in a subsample of every stratum \(\left( {n_{h} = \nu _{h} n_{h}^{\prime } } \right).\) Data of the target variable are finally measured in phase three in a further subsample of the second-phase sample per stratum \(\left( {n_{h}^{*} = \nu _{h}^{*} n_{h} } \right).\) To estimate the mean of the target variable, the differences between the means of the auxiliary variable in the second and the third phase are used together with the mean of the target variable estimated from phase three.

Fig. 2 Sampling procedure of the three-phase design

The mean of the target variable can be estimated using Eq. 7, where \(\overline{y}_h^*\) denotes the sample mean of the target variable in a sub-sample of the second-phase sample with sample size \(n_{h}^{*}\) in stratum h. \(\overline{x}_h\) denotes the sample mean of the auxiliary variable in stratum h (second-phase sample size \(n_h\)), and \(\overline{x}_h^*\) stands for the mean of the auxiliary variable in stratum h calculated from phase three with sample size \(n_{h}^{*}.\) The proportion of first-phase sample points per stratum is used for weighting the strata means.

$$ \widehat{\overline{Y}}_{2st,2lr}=\sum^L_{h=1}{\frac{n_h^{\prime}}{n^{\prime}} \widehat{\overline{Y}}_{h,2lr}}=\sum^L_{h=1}w_h \left(\overline{y}_h^*+b_h\left(\overline{x}_h-\overline{x}_h^*\right)\right) $$
(7)

The estimated regression coefficient \(b_h\) is calculated per stratum as follows.

$$ b_h=\frac{\sum_{j=1}^{n_h^*}\left(y_{hj}-\overline{y}_h^*\right) \left(x_{hj}-\overline{x}_h^*\right)}{\sum_{j=1}^{n_h^*} \left(x_{hj}-\overline{x}_h^*\right)^2} $$
(8)

\(x_{hj}\) and \(y_{hj}\) are the auxiliary and the target variable at unit j of stratum h.

Estimator 7 is identical with the so-called updated first occasion mean of Scott and Köhl (1994), which is one of two components of their stratified SPR estimator, but their variance estimator is based on the finite population approach of Cochran (1977).

The approximate variance under the infinite population approach for the first phase, as a measure of precision of estimation, is given by Eq. 9, and an estimator of it by Eq. 10 (for the proofs see "Appendix"). \({s_h^*}^2\) and \({r_h^*}^2\) are the empirical variance of the target variable and the squared empirical correlation between x and y in the third-phase sample of stratum h; \({s_h^{\prime}}^2\) and \({r_h^{\prime}}^2\) are the respective statistics of the first-phase samples. The structure of the variance and its estimator, simply a sum of the respective statistic for pure 2st-sampling and an additional term accounting for the third-phase variability, is a direct consequence of the well-known variance decomposition given in Appendix A.1.

$$ \begin{aligned} &V\left(\widehat{\overline{Y}}_{2st,2lr}\right)\\ &\approx\frac{1}{n^{\prime}}S^2+E\frac{1}{n^{\prime}}\sum_{h=1}^L w_h\left(\frac{{s_h^{\prime}}^2\left(1-{r_h^{\prime}}^2\right)}{\nu_h^* \nu_h}+\frac{{s_h^{\prime}}^2 {r_h^{\prime}}^2}{\nu_h}-{s_h^{\prime}}^2\right)\\ &=V\left(\widehat{\overline{Y}}_{2st}\right)+E\frac{1}{n^{\prime}} \sum_{h=1}^L w_h\left(\frac{1}{\nu_h^*}-1\right)\frac{{s_h^{\prime}}^2 \left(1-{r_h^{\prime}}^2\right)}{\nu_h} \end{aligned} $$
(9)
$$ \begin{aligned} &\widehat{V}\left(\widehat{\overline{Y}}_{2st,2lr}\right)\\ &=\widehat{V}\left(\widehat{\overline{Y}}_{2st}\right)+ \frac{1}{n^{\prime}}\sum^L_{h=1}w_h\left(\frac{1}{\nu_h^*}-1\right) \frac{{s_h^*}^2\left(1-{r_h^*}^2\right)}{\nu_h}\frac{n_h^*-1}{n_h^*-2} \end{aligned} $$
(10)
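The step between the two lines of Eq. 9 is elementary algebra on the bracketed term. As a worked sketch, for each stratum h:

$$ \begin{aligned} &\frac{{s_h^{\prime}}^2\left(1-{r_h^{\prime}}^2\right)}{\nu_h^* \nu_h}+\frac{{s_h^{\prime}}^2 {r_h^{\prime}}^2}{\nu_h}-{s_h^{\prime}}^2\\ &=\left(\frac{1}{\nu_h}-1\right){s_h^{\prime}}^2+\left(\frac{1}{\nu_h^*}-1\right)\frac{{s_h^{\prime}}^2\left(1-{r_h^{\prime}}^2\right)}{\nu_h} \end{aligned} $$

The first summand on the right-hand side, together with \(S^2/n^{\prime},\) is the part that Eq. 9 identifies as the variance of the 2st estimator; that identification is taken from Eq. 9 and not derived here. The second summand is the additional third-phase term, which vanishes for \(\nu_h^*=1\) or \({r_h^{\prime}}^2=1.\)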

The expectation in Eq. 9 is calculated over all first-phase samples of size \(n^{\prime}.\) With increasing correlations \({r_{h}^{\prime}}^{2},\) the variance of the three-phase estimator converges from above to the variance of the 2st estimator. Foresters are usually also interested in the relative sampling error (rel. SE) as given in Eq. 11.

$$ {\text{rel}}{\text{.SE}} = \frac{{\sqrt {{\text{Var}}\hat{\bar{Y}}_{{2st,2lr}} } }}{{\hat{\bar{Y}}_{{2st,2lr}} }} $$
(11)
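The three-phase estimator (Eqs. 7 and 8), the variance estimator (Eq. 10) and the rel. SE (Eq. 11) can be sketched as below. This is an illustration, not the implementation used in this study. All names are hypothetical, and the 2st variance estimate entering Eq. 10 is passed in as a given number (`var_2st`) because its computation is a separate step.

```python
import math


def three_phase_estimate(strata, var_2st):
    """Three-phase mean (Eq. 7), variance estimator (Eq. 10), rel. SE (Eq. 11).

    strata  -- list of dicts with keys
               "n_prime":  first-phase sample size n'_h
               "x_second": x at all n_h second-phase units
               "third":    (x, y) at the n*_h third-phase units
    var_2st -- estimate of the 2st variance (first summand of Eq. 10),
               assumed to be supplied by the caller
    """
    n_prime = sum(s["n_prime"] for s in strata)
    y_hat, extra = 0.0, 0.0
    for s in strata:
        xs = [x for x, _ in s["third"]]
        ys = [y for _, y in s["third"]]
        n_h, n_star = len(s["x_second"]), len(ys)
        if n_star < 3:
            raise ValueError("Eq. 10 needs at least three third-phase units per stratum")
        w_h = s["n_prime"] / n_prime
        nu_h = n_h / s["n_prime"]
        nu_star = n_star / n_h
        x_bar_h = sum(s["x_second"]) / n_h
        x_bar_star, y_bar_star = sum(xs) / n_star, sum(ys) / n_star

        sxx = sum((x - x_bar_star) ** 2 for x in xs)
        sxy = sum((x - x_bar_star) * (y - y_bar_star) for x, y in s["third"])
        syy = sum((y - y_bar_star) ** 2 for y in ys)

        b_h = sxy / sxx                                               # Eq. 8
        y_hat += w_h * (y_bar_star + b_h * (x_bar_h - x_bar_star))    # Eq. 7

        s2_star = syy / (n_star - 1)
        r2_star = sxy ** 2 / (sxx * syy)
        extra += (w_h * (1.0 / nu_star - 1.0) * s2_star * (1.0 - r2_star)
                  / nu_h * (n_star - 1) / (n_star - 2))

    var_hat = var_2st + extra / n_prime                               # Eq. 10
    rel_se = math.sqrt(var_hat) / y_hat                               # Eq. 11
    return y_hat, var_hat, rel_se


# Hypothetical toy data: two strata, half of the second-phase units remeasured.
toy_strata = [
    {"n_prime": 50, "x_second": [10.0, 20.0, 30.0, 40.0, 50.0, 60.0],
     "third": [(10.0, 14.0), (30.0, 30.0), (50.0, 52.0)]},
    {"n_prime": 50, "x_second": [100.0, 120.0, 140.0, 160.0, 180.0, 200.0],
     "third": [(100.0, 105.0), (140.0, 150.0), (180.0, 186.0)]},
]
y_hat_3p, var_hat_3p, rel_se_3p = three_phase_estimate(toy_strata, var_2st=4.0)
```

If every second-phase unit is also measured in the third phase, the additional term is zero and the variance estimate equals `var_2st`.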

Case study

Sampling scheme and inventory data

Since 1999 the Forest District Inventory of Lower Saxony (Germany) has been carried out in a cycle of approximately ten years according to a 2st design (Böckmann et al. 1998; Saborowski et al. 2010). In the first phase of this sampling procedure, sample points are located in a 100 m × 100 m grid, and CIR aerial images are used to assess stand age and type at these points. As a result of this assessment, every point is assigned to one of eight strata depending on dominating species group (DEC: Deciduous; CON: Coniferous) and age class (1: ≤40 years; 2: >40–80 years; 3: >80–120 years; 4: >120 years). As Saborowski et al. (2010) point out, this stratification assumes (1) a close relationship between age and species group and volume, (2) that the distinction of four age classes and two species groups can easily be done using aerial images, and (3) that the optimum allocation is expected to hold, at least approximately, for a repeated inventory. A certain proportion \((\nu_h)\) of first-phase points, differing among the strata, is systematically chosen in the second phase from a list of all \(n_h^{\prime}\) points of stratum h. These proportions differ because the estimation precision required by the forest administration was higher for trees above a specified dbh-threshold (5 % rel. SE) and lower for smaller trees (down to 30 % rel. SE). At the second-phase points, two concentric plots with a radius of 6 m (for trees with 7 cm ≤ dbh < 30 cm) and 13 m (trees with dbh ≥ 30 cm), respectively, are established and inventoried. In four forest districts of Lower Saxony, Liebenburg, Reinhausen, Grünenplan and Saupark, the inventory has meanwhile been carried out twice. Differing from the regular ten-year time span between two inventories, it ranged here from seven to ten years. A new stratification with the help of aerial images did not take place at the second occasion, and so the stratification of the first inventory was used.
Due to problems with the identification of the exact plot position, not all plots surveyed from the first occasion could be resampled. In total, data from 27,332 first- and 6,343 second-phase plots were used for this case study (Table 1). For these plots, data from two occasions were available. In our case study, we assume random sampling in the first and second phase, as well as for the subsampling in the third phase, which was not carried out in practice. The third phase was only virtually implemented in our study.
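Two bookkeeping steps of this design can be sketched in code: the assignment of a point to one of the eight strata and the conversion of trees measured on the concentric plots to a per-hectare volume. The sketch rests on assumptions that are not stated in the text: the stratum labels (e.g. "DEC1") are hypothetical, and the per-hectare expansion by the inverse plot area is the usual practice for fixed-area plots, not a documented detail of this inventory.

```python
import math

AGE_BREAKS = (40, 80, 120)  # upper limits of age classes 1-3 in years


def stratum(species_group, age_years):
    """Stratum label from species group (DEC/CON) and stand age."""
    if species_group not in ("DEC", "CON"):
        raise ValueError("species_group must be 'DEC' or 'CON'")
    age_class = 1 + sum(age_years > limit for limit in AGE_BREAKS)
    return f"{species_group}{age_class}"


def plot_radius_m(dbh_cm):
    """Radius of the concentric plot on which a tree of this dbh is recorded."""
    if dbh_cm < 7:
        return None          # below the measurement threshold
    return 6.0 if dbh_cm < 30 else 13.0


def volume_per_ha(trees):
    """Per-hectare volume from a list of (dbh_cm, volume_m3) tree records.

    Assumption: each tree is expanded by 10,000 m2 divided by the area
    of the plot on which it was recorded.
    """
    total = 0.0
    for dbh_cm, volume_m3 in trees:
        radius = plot_radius_m(dbh_cm)
        if radius is None:
            continue
        total += volume_m3 * 10000.0 / (math.pi * radius ** 2)
    return total
```

Under this assumption, a tree on the 6 m plot represents about 88 trees per hectare and a tree on the 13 m plot about 19, which is why smaller trees carry more weight per recorded stem.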

Table 1 First- and second-phase sample sizes in the eight strata of the four forest districts

Tree growth simulation

The simulations were carried out with the program WaldPlaner 2.0, which uses the statistical individual-tree growth model BWINPro (Nagel and Schmidt 2006). This program was developed by the Northwest German Forest Research Station and is used in the planning process of the Forest Service in Lower Saxony (Nagel and Schmidt 2006). Therefore, the default settings follow the Federal State silvicultural program (LÖWE), which aims to raise the proportion of mixed and broadleaved stands. Because it was parameterised with data from Northern Germany, particularly from Lower Saxony, the results of this simulator are expected to be more reliable for our case study than the results of other growth simulators such as SILVA or SIBYLA, which have been parameterised with data from Southern Germany and Slovakia, respectively (Fabrika and Ďurský 2006; Pretzsch et al. 2006). Different studies (e.g. Vospernik et al. 2010) show that the growth projections of this program provide reasonable results.

WaldPlaner 2.0 generates a model stand of predetermined extent driven by the input data for better representation of neighbourhood and for the minimisation of edge-effects. This model stand is built with clones of the sample-trees. Depending on their dbh and differing selection probabilities (concentric circles), the measured trees are cloned several times, smaller trees (dbh < 30 cm) more often than bigger ones (dbh ≥ 30 cm). The coordinates of these clone-trees are initialised randomly. Afterwards, an algorithm moves the coordinates until a constellation with little competition is achieved. For height and diameter increment, a normally distributed error is computed on the tree level.

The data from the second phase of the first inventory were used for simulation runs using the program WaldPlaner 2.0. The sizes of the model stands were 0.2 ha, and we derived key figures, such as volume per ha, from these stands and assigned them to the sample units. We tested different realistic parameterisations, but because the influence of the parameterisations on the sampling error of the inventory was extremely low in most target populations, we used the results of the simulation runs with default settings for further calculations. In the Forest District Liebenburg, we also tested the effect of different initialisations and predictions on the correlations between simulated and measured values, using ten different simulations. The values were calculated stratum-wise for every target population, as needed for Eq. 10. Because the effect was very small (the range of the squared correlations can be described by \(q_{0.25} = 0.0004\) and \(q_{0.75} = 0.025\)), we used the results of just one simulation run in each district and did not compute mean values. In Lower Saxony, clear-cuts are not allowed as a regular silvicultural treatment; therefore, they are assumed not to occur between the two occasions of the inventory.

Evaluation procedure

With this case study, we aimed to assess (1) the performance of the new estimator and (2) the effect of using growth model–based updates instead of original data from the first inventory occasion. For the latter, all steps explained in the following were done with these two types of data as auxiliary variable in the regression part of the new estimator. The measured volumes per ha of the second occasion served as values of the dependent variable.

Correlations between these two variables were calculated as required for Eq. 10. Differing from the most general case in that equation, we used the same third-phase proportion in all strata \(\left( {\nu_{h}^{*} = \nu^{*} } \right)\) instead of proportions differing among strata. Values for \(\nu^{*}\) ranged from 1/n to 1. Wherever an estimation of the volume was required we used the value that was calculated with the 2st-estimator and all terrestrial sampling points. All calculations were carried out for nine different target populations, defined by dbh and tree species (Table 2). Whereas the volume per tree was calculated within the growth model, all other calculations were done with the statistical software package R (R Development Core Team 2010).

Table 2 The nine target populations in the case study

Correlations between (updated) first occasion and second occasion volumes were calculated within each stratum and across all strata for every target population. Furthermore, we fitted linear regressions for every target population, separately for each stratum and over all strata.

The rel. SEs of the new estimator were compared with the corresponding values calculated from the data of the second occasion according to the classical 2st approach. Because the variances and thus the rel. SEs of the two estimators are identical if the values of all second-phase plots (n) are included in the calculations (Eq. 10, \(\nu^{*} = 1\)), we looked at the proportion of saved sample plots as a function of the relative increase of the rel. SE.

To compare the two different types of auxiliary data in the regression estimator, we calculated the differences between the proportions of saved sample points of these estimations at the same increases of rel. SE.
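One building block of this evaluation, the relative increase of the rel. SE for a common third-phase proportion \(\nu^{*},\) can be sketched with Eq. 10 as follows. The sketch is an illustration under assumptions: the names are hypothetical, the stratum statistics are held fixed while \(\nu^{*}\) varies, \(n_h^{*}=\nu^{*}n_h\) is not rounded to an integer, and the same volume estimate is used in the numerator and denominator of the comparison, so that it cancels. It does not attempt to reproduce the comparison with a reduced 2st design reported in the Results.

```python
import math


def rel_se_increase(strata, n_prime, var_2st, nu_star):
    """Relative increase of the rel. SE of the 2st,2lr-estimator over pure 2st.

    strata  -- list of dicts with keys "n_prime_h", "n_h", "s2_star", "r2_star"
    n_prime -- total first-phase sample size n'
    var_2st -- 2st variance estimate (first summand of Eq. 10)
    nu_star -- common third-phase proportion, 0 < nu_star <= 1
    """
    extra = 0.0
    for s in strata:
        n_star = nu_star * s["n_h"]          # assumption: not rounded
        if n_star <= 2:
            raise ValueError("nu_star too small: Eq. 10 needs n*_h > 2")
        w_h = s["n_prime_h"] / n_prime
        nu_h = s["n_h"] / s["n_prime_h"]
        extra += (w_h * (1.0 / nu_star - 1.0)
                  * s["s2_star"] * (1.0 - s["r2_star"]) / nu_h
                  * (n_star - 1.0) / (n_star - 2.0))
    var_3p = var_2st + extra / n_prime       # Eq. 10
    return math.sqrt(var_3p / var_2st) - 1.0


# Hypothetical single-stratum example.
toy = [{"n_prime_h": 100, "n_h": 20, "s2_star": 100.0, "r2_star": 0.75}]
increases = {nu: rel_se_increase(toy, n_prime=100, var_2st=5.0, nu_star=nu)
             for nu in (1.0, 0.5, 0.25)}
```

Here \(1-\nu^{*}\) is the share of second-phase plots that are not remeasured; plotting it against the returned increase gives a curve of the same kind as, but not necessarily identical to, those evaluated in this study.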

Results

The results of the inventory on the second occasion show that the actual 2st scheme is appropriate to generate good and reliable results (Table 3 in the "Appendix"). In 29 of 36 target populations, the achieved rel. SE is below or equal to the requested precision. The estimated rel. SEs vary between 3.04 % (Beech 25–50 in Reinhausen) and 18.33 % (Oak <25 in Liebenburg). The precision differs among forest districts, species and diameter classes. Whereas the precision is very good for the Beech and Spruce target populations, it is lower for the Oaks. Only in the Forest District Liebenburg was the target precision achieved for less than 75 % of the target populations. As for the precisions in the different diameter classes, the 2st scheme provides the requested rel. SE in all small and medium, but only in 5 of the 12 big diameter classes, although in 2/3 of the latter the rel. SE is below 7 %.

Growth model–based updates

The relationship between simulated and measured volumes, indicated by Pearson’s correlation coefficient (see Table 4 in the "Appendix"), is very strong. Values, calculated over all strata, vary between 0.73 and 0.93 among target populations. Calculation of the correlation coefficients within each stratum shows that the values vary considerably more among the eight strata. While for some target populations, only weaker correlations (−0.01 ≤ r < 0.5) could be found in one or more strata, a very strong correlation (r ≥ 0.75) appears for other target populations in all strata. This leads to a broad range of correlations, including extremes such as −0.01 and 1.00; the quantile \(q_{0.25}\) is 0.71 and \(q_{0.75}\) is 0.9. Comparing the correlations of the different species groups, it becomes obvious that the correlations of the Beech group are very good in most cases (r > 0.75 in 86 %). In contrast, the values for the Spruce groups indicate weaker relationships (0.5 < r ≤ 0.75 in 40 %) in a lot of strata.

Calculation of linear regressions showed that the relationships between measured and simulated volumes vary remarkably among strata. For some target populations, the slope is the same in all strata; hence, no interaction between stratum and slope exists. Other target populations show a high variety of slope-values, indicating strong interactions between stratum and slope. Overall the slope parameters range from −0.01 to 3.14 and the intercepts from −2.19 to 279.89. The \(r^2\)-values of the linear regressions vary from 0 to 1; the quantiles (\(q_{0.25} = 0.56,\) \(q_{0.75} = 0.82\)) indicate that these regressions are able to explain the variability well in most cases.

The results for the new estimator (Fig. 3) show that it could reduce the number of sample plots remarkably compared with pure 2st, accepting a certain decrease in precision. In the three diameter classes, the proportions of saved sample points are highest for the Oaks and lowest for the Spruces. The range of the proportions of saved sample points between forest districts is very narrow for the Beech populations and wider for the two other species groups.

Fig. 3 The proportion of saved sample points (%) as a function of increasing relative sampling error (%) in the small (a), medium (b) and big (c) diameter classes in the four forest districts. The shaded areas indicate the spread of values across the forest districts. In the regression estimator, the correlations between growth model–based updates and measured values at the second occasion were used

For example, for the big Beeches (Fig. 3c), a 10 % higher rel. SE, compared with the 2st procedure with full second-phase sample size n, could be achieved with the 2st,2lr-procedure using 22–33 % (depending on the district) fewer sample plots on the second occasion than with the reduced 2st-procedure. For the Spruces, the span is from 10 % to 23 %, and for the Oaks from 25 % to 35 %. For the smaller diameter classes (Fig. 3a, b), these savings are even higher.

Data from the first inventory occasion

Over all strata, the values of Pearson’s correlation coefficient vary between 0.6 and 0.97 among target populations (Table 4 in the "Appendix"). As in the case described above, the correlation coefficients vary considerably when calculated stratum-wise. The values range from −0.03 to 1; \(q_{0.25}\) is 0.66 and \(q_{0.75}\) is 0.89. In general, the correlations are highest for the Beech target populations and lowest for the Spruce target populations.

Within the target populations, the relationships between the data of the first and the second occasion also vary among strata, with slope parameters between −0.04 and 3.50. The values for the intercepts range from −8.72 to 265.56. For some target populations, strong interactions between stratum and slope exist; for other target populations, no interaction is detectable. The \(r^2\)-values of the linear regressions vary between 0 and 1; the corresponding quantiles are 0.57 (\(q_{0.25}\)) and 0.84 (\(q_{0.75}\)). Hence, the regressions seem mostly able to explain the variability well.

In all diameter classes, the highest proportions of saved sample points could be achieved for the Oaks and the lowest for the Spruces (Fig. 4). Again the range of the results is narrow for the Beeches and wider for the two other species.

Fig. 4 The proportion of saved sample points (%) as a function of increasing relative sampling error (%) in the small (a), medium (b) and big (c) diameter classes in the four forest districts. The shaded areas indicate the spread of values across the forest districts. In the regression estimator, the correlations between measured values at the first and second occasion were used

Comparison of input data

In most cases, the use of growth model–based updates clearly improves the performance of the 2st,2lr-estimator (Fig. 5) compared with the approach based on the measurements of occasion 1. Only for the Oaks with big diameters does the use of the data from the first occasion lead to considerably better results.

Fig. 5 The differences of the proportions of saved sample points (%) between the results of the 2st,2lr with simulated values and with values of the first inventory. Results are shown as a function of increasing relative sampling error (%) for the small (a), medium (b) and big (c) diameter classes in the four forest districts. The shaded areas indicate the spread of values across the forest districts

Discussion

Coming back to the initial question of the general performance of the 2st,2lr-estimator, we state that it is possible to save sample plots and thereby inventory costs, if a certain decrease in precision is accepted. The extent of savings depends on the correlation between the auxiliary and the original data. The main result is that in almost all target populations of our case study, the correlation between updated data from the first and measured data from the second occasion is higher than the one between measured data from the first and second occasion, yielding a higher cost-saving potential for the growth model–based updates of the previous inventory data.

Apart from the large Oaks, our results are mostly consistent with several other studies (e.g. Vospernik et al. 2010), which show that WaldPlaner 2.0 is able to produce realistic results. The use of the results of the simulation runs with default settings can be justified by the extremely low influence of these settings on the sampling errors of the inventory and the fact that the default settings follow the silvicultural program of Lower Saxony. Moreover, changes of these settings can in principle be made in the model, but they require further detailed knowledge of the thinning strategies applied in the forest districts, which are difficult to quantify in practice. A reason for the similarity between the simulation runs can be seen in the short simulation period of approximately ten years. In longer simulation periods, the differences between these runs are expected to be bigger. The effect of different initialisations and simulation runs is also expected to be bigger in longer simulation periods. With larger variability among different runs, several simulations should be carried out and the mean value be used, because the auxiliary variable is assumed to be non-random. In our case study, the variability was negligible.

The many high values of Pearson’s correlation coefficient show that the growth projections produce reasonable results. Hence, WaldPlaner 2.0 seems to be a suitable tool for this study. However, it has to be considered that points where the volume of trees in a certain target population has been neither measured nor simulated are included in the calculation and raise the correlation. It is interesting to note that the correlation for some target populations is very high in strata where one would not expect a high occurrence of this population, for example, the Oaks in the coniferous strata of Liebenburg. A possible explanation for these high correlations might be seen in the high number of plots with a stand volume of 0 m³/ha in the considered target population.

Even though the correlations are high in most cases, a further increase of these values is desirable but can hardly be achieved with the current growth models for several reasons: (1) Extreme differences between measured and simulated volumes can partly be explained by calamities. At some points, the standing volume has been reduced through insect outbreaks, windstorms or fire. These calamities could not be simulated by the growth model, and therefore the differences between the volumes are big at these points. (2) Another reason for discrepancies between the two volumes can be seen in the strict thinning routine in the model, where all trees are harvested when they reach the species-specific target-diameter. In reality, not every tree that reaches the corresponding target-diameter is harvested. Rather, the neighbourhood-situation is evaluated by the forester, and tree-harvesting follows this assessment. The target-diameter is handled with much more flexibility in practical forestry than in the growth model. In our case study, this may especially be the case for the Oaks with big diameters. (3) The combination of using clone-trees in the model and of analysing the results per target population might explain some of the observed differences between the two values. In reality, a target population might disappear when only one tree is harvested and no other trees of this target population exist. Due to the use of clone-trees, it is unlikely that a target population disappears in the model.

A recent approach for the improvement of growth models is the inclusion of calamities, such as infestation by bark beetles (Overbeck and Schmidt 2012) or windstorms (Schmidt et al. 2010). Moreover, new approaches for modelling height growth exist. Further enhancement of growth models can be expected from parameterisation of additional tree species, climate-sensitive and local calibration or an improved modelling of silvicultural treatments.

The advantage of the new approach is that it exploits the high correlations between simulations and measurements, even though the deviations of the simulations from the measurements can be quite large. With the achieved precisions, this procedure is attractive for periodic forest inventories under temporarily restrictive financial constraints. This is because the growth projections for the regression part of the estimator require a database of recent inventory data in which more terrestrial plots were measured than are planned for the current, reduced inventory.
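
Why a regression estimator can tolerate strongly biased simulations may be easier to see in a small sketch. It uses the textbook double-sampling regression estimator for a single stratum, which is a deliberate simplification and not the three-phase estimator of this study. All volumes are invented, the linear relation is noise-free for clarity, and the function names are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)


def regression_estimate(x_large, x_sub, y_sub):
    """Double-sampling regression estimator of the mean of y.

    x_large: predictions on all plots of the large sample
    x_sub, y_sub: predictions and measurements on the measured subsample
    """
    mx, my = mean(x_sub), mean(y_sub)
    sxx = sum((x - mx) ** 2 for x in x_sub)
    sxy = sum((x - mx) * (y - my) for x, y in zip(x_sub, y_sub))
    slope = sxy / sxx
    # Subsample mean of y, corrected by the difference in mean predictions.
    return my + slope * (mean(x_large) - mx)


# Invented simulated volumes (m3/ha) for all plots of the earlier inventory.
x_large = [200, 240, 280, 320, 360, 400, 440, 480, 520, 560]
# Reduced current inventory: four plots with simulation and measurement.
# Assumed relation: measured = 0.5 * simulated + 20, so the simulation
# overestimates strongly although the correlation is perfect.
x_sub = [200, 320, 400, 560]
y_sub = [120, 180, 220, 300]

estimate = regression_estimate(x_large, x_sub, y_sub)
print("mean of simulations:", mean(x_large))
print("mean of measured subsample:", mean(y_sub))
print("regression estimate:", estimate)
```

In this idealised case, the mean of the simulations is far from the mean implied by the measurements, yet the regression estimate recovers the value that measuring all ten plots would give under the assumed relation. If every plot with a simulation is also measured, the correction term vanishes and the estimate reduces to the plain mean of the measurements.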

The results for the linear regressions support the findings about the correlation coefficients, and the broad range of possible relationships within the different strata becomes obvious. Slope parameters of 0 or smaller indicate a poor performance of the growth model or a volume reduction between the two occasions. These cases are assumed to occur in target populations with a low number of plots having a stand volume > 0 m³/ha. From the slope parameters, it can be seen that the growth model overestimates the stand volume in some strata and underestimates it in others.
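
As a rough illustration of how such slope parameters arise, the following sketch fits an ordinary least squares line per stratum. It assumes that measured volume is regressed on simulated volume; the stratum names and all volumes are invented, and `fit_line` is a hypothetical helper.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Invented (simulated, measured) volumes in m3/ha for two strata.
strata = {
    # Simulation overestimates by a constant factor.
    "well stocked": ([100, 200, 300, 400], [80, 160, 240, 320]),
    # Only a few plots with volume > 0, and those disagree.
    "sparse": ([0, 0, 0, 150], [0, 0, 40, 0]),
}

fits = {name: fit_line(sim, meas) for name, (sim, meas) in strata.items()}
for name, (intercept, slope) in fits.items():
    flag = "  <- slope <= 0" if slope <= 0 else ""
    print(f"{name}: intercept = {intercept:.1f}, slope = {slope:.2f}{flag}")
```

In the invented "sparse" stratum, a single simulated volume without a matching measurement is enough to turn the slope negative, which mirrors the situation described for target populations with few plots above 0 m³/ha.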

Of course, the new estimator could not reach the target precision in cases where the error of the 2st scheme was already above the target. Regarding the savings that could be achieved with the new sampling procedure, it has to be noted that additional costs are incurred for the simulations and calculations. However, these costs will be negligible compared with those of terrestrial sampling.

Conclusions

Comparing classical 2st with the approach proposed here, it is clear that the new approach coincides with simple 2st if the same second- and third-phase sample sizes are realised. The new approach becomes advantageous when the sample size of the current inventory is reduced and hence a lower accuracy of estimation is accepted. In these cases, the savings in sample plots and the resulting inventory costs are remarkable. The 2st, 2lr-estimator can be used with data from the last occasion or with growth model–based updates. Using the latter allows for potentially higher savings, due to higher correlations.

The superiority of this three-phase estimator over the composite estimator analysed earlier (von Lüpke et al. 2011) can be explained by the often large bias of the WaldPlaner 2.0 predictions as one component of the composite estimator. Despite this large bias, the correlations with plot measurements are usually high and can successfully be exploited in the regression estimator, which is part of the new three-phase approach. Of course, this sampling scheme cannot be applied continuously in forest inventories, because the sample sizes would then be reduced on every occasion. Thus, we recommend its use as a low-cost inventory alternating with the regular full double sampling inventory or as a temporary intermediate inventory between two regular sampling occasions of a continuous forest inventory.

If forest growth models are further enhanced, for example through model calibration, and thereby achieve higher estimation accuracies, the results of this estimator are likely to improve further.