Effect of permanent plots on the relative efficiency of spatially balanced sampling in a national forest inventory
Abstract
Key message
Spatially balanced sampling that uses auxiliary information in the design phase can enhance the design efficiency of a national forest inventory. These gains decreased as the proportion of permanent plots in the sample increased. Using semi-permanent plots, which are replaced every nth inventory round, instead of permanent plots reduced this loss. Further studies are needed on how to account for the permanent sample when selecting the temporary sample.
Context
National forest inventories (NFIs) produce national- and regional-level statistics for sustainability assessment and decision-making. Using an interpreted satellite image as auxiliary information in the design phase improved the relative efficiency (RE). Spatially balanced sampling through the local pivotal method (LPM), used for selecting clusters of sample plots, is designed for temporary samples; thus, the method was tested in an NFI design with both permanent and temporary clusters.
Aims
We evaluated the LPM and stratified sampling for an NFI designed for successive occasions, where the clusters are permanent, semi-permanent, or temporary, being replaced never, every nth inventory round, and every inventory round, respectively.
Methods
REs of sampling designs against systematic sampling were studied with simulations of inventory sampling.
Results
The larger the proportion of permanent clusters, the smaller the benefits gained with LPM. The REs of stratified sampling did not depend on the proportion of permanent clusters. Semi-permanent sampling with LPM removed this decrease and resulted in the largest REs.
Conclusion
Sampling strategies with semi-permanent clusters were the most efficient, yet not necessarily optimal for all inventory variables. Further development of methods that take the distribution of the permanent sample into account when selecting the temporary or semi-permanent sample is desirable, since it could increase the design efficiency.
Keywords
Auxiliary information, Local pivotal method, Permanent cluster, Relative efficiency, Sampling design, Semi-permanent cluster
1 Introduction
National forest inventories (NFIs) are the main source of information for characterizing the state of the forest resources (Vidal et al. 2016, p. 8). The most common inventory variables are forest area, mean growing stock volume, and the distribution of growing stock volume by tree species and timber assortment (Tomppo et al. 2010; Vidal et al. 2016). In addition to the current growing stock, estimating the changes in the forests over time is important. The plots can be permanent, meaning they are remeasured in all consecutive inventory rounds, or temporary, meaning they are discarded after the first measurement. Temporary plots are mainly intended to capture the current state of the forest, whereas permanent plots aim at capturing the changes in addition to the current state (Scott 1998; Tomppo et al. 2010). Even though the increment of growing stock can be accurately measured via increment cores from temporary plots, estimates of changes such as natural mortality and harvests are much more precise from permanent than from temporary plots (e.g., Päivinen and Yli-Kojola 1989). NFIs can be based solely on temporary plots (e.g., Poland, Portugal, France, and Spain), solely on permanent plots (e.g., Austria, Iceland, China, and Canada), or on a combination of these two plot types (e.g., Finland, Sweden, Netherlands, Estonia, New Zealand; see Tomppo et al. 2010). The designs also change over time; for instance, plans to introduce permanent plots in France have been reported (Vidal et al. 2016).
An inventory with purely permanent plots is called a continuous forest inventory (CFI). Sometimes, established permanent plots may lose their value as indicators of change. For example, treatment bias can arise when permanent plots are managed differently from the surrounding forests, which affects CFI estimates (Köhl et al. 2015). In such a case, the possibility to also redistribute the permanent plots would be beneficial.
Another option is a sampling design where the permanent plots are only used for a limited time, i.e., they are semi-permanent. Such designs are flexible since the priorities in survey may be changed from a round to another by allocation of different numbers of temporary plots (Scott and Köhl 1994). A semi-permanent plot is surveyed in at least two consecutive inventory rounds but then relocated like a temporary plot. Therefore, it is capable of capturing change and, in addition, with an efficient reallocation, is not susceptible to the treatment bias in the same way as permanent plots. An example of this is sampling with partial replacement (e.g., Patterson 1950; Matis et al. 1984; Köhl et al. 1995).
The measurement costs of permanent plots have been higher than those of temporary plots, due to necessity of making sure the plot is found for remeasurements. However, with modern GPS, the trees in temporary plots may be located as accurately as the trees in permanent plots, and therefore, the measurement costs do not differ markedly any more (e.g., Tomppo et al. 2014). This makes it possible to introduce new permanent plots without additional costs.
In Sweden, the temporary clusters in the current NFI round, which began in summer 2018, were chosen with spatially balanced sampling using the local pivotal method (LPM) in the sample selection (Grafström et al. 2017b). This has motivated us to test the same method in the Finnish NFI setting. In spatially balanced sampling, the distribution of the auxiliary variables in the sample is matched as closely as possible to their distribution in the entire population (Grafström et al. 2012). Auxiliary data may be any data available for all units of the population, with no upper limit on the number of auxiliary variables used. Typical auxiliary variables are spatial location, other geographic data such as altitude, and remotely sensed data (e.g., Grafström and Ringvall 2013; Grafström et al. 2014). The underlying assumption is that the auxiliary information and the inventory variables are correlated (Grafström et al. 2012). LPM is a sample selection method that results in an approximately spatially balanced sample (Grafström and Lundström 2013). The LPM was assessed in a simulation study with independent auxiliary information and real NFI field data, where all sampling units belonged to one and the same population available for sampling (Räty et al. 2018). In other words, the setting in that study corresponded to an inventory with temporary inventory plots only. The LPM can also be combined with other sampling methods such as stratification: in such a case, the LPM would be carried out separately within each stratum.
So far, there is no approach accounting for the distribution of existing permanent sample when selecting a temporary sample with the LPM. Such an approach should not compromise the requirement that each unit in the population has larger than zero probability to be included in the sample. To date with LPM, it has been only possible to match the distribution of the temporary sample irrespective of the existing permanent sample. Therefore, in the case of permanent sample, stratification with systematic or random sample selection may be more efficient than stratification with LPM or pure LPM. In stratified sampling, the sample within a stratum is populated first with the permanent sample belonging to that stratum. Then, the remaining sample within a given stratum is filled using systematic or random selection. Thus, while stratified sampling (with or without LPM) was shown to be less robust than pure LPM in our previous study (Räty et al. 2018), it may be more robust than LPM in a design involving permanent plots.
In this study, we assess the efficiency of sampling designs by simulating the second phase of inventory sampling with different proportions of permanent clusters in the sample. Our first hypothesis is that as the proportion of permanent clusters in the sample increases, the relative efficiency (RE) of a sampling design using LPM for temporary plot selection decreases, because a larger proportion of the sample is chosen without utilizing the auxiliary information. In other words, with a larger proportion of permanent clusters, it is more difficult to match the distribution of the total sample, including both temporary and permanent clusters, to the distribution of the auxiliary variables over the study region. As our second hypothesis, we assume that as the proportion of permanent clusters in the sample increases, the performance of stratified sampling designs with systematic plot selection improves compared with that of LPM sampling. This is because the sampling units in stratified sampling are selected independently of each other, without any need to account for the distribution of the permanent clusters in the same way as in LPM. In the last assessment, the permanent clusters are treated as semi-permanent, meaning that all the semi-permanent clusters are resampled and allowed to change their position at the same time. In that case, both the entire temporary cluster population and the semi-permanent cluster population are sampled with LPM using auxiliary variables. This setup can be thought of as the maximal potential achievable with semi-permanent clusters. We assume that this semi-permanent/temporary sampling design would be more efficient than the design with permanent and temporary plots but not as efficient as the design where all clusters are temporary.
2 Material and methods
2.1 Study region and primary data
The study region is the southern part of Finland, excluding the southwestern archipelago; it covers about 153,000 km^{2} of land area and comprises two sampling regions (Fig. 1a). The primary data in this study are the field data from the 11th Finnish NFI (NFI11), which was carried out in 2009–2013. The sample plots are arranged in clusters, with slightly different cluster designs in the two sampling regions (Fig. 1b, c). The data comprise altogether 46,914 field sample plots in N = 5408 clusters, of which 1082 were permanent and the remaining 4326 temporary.
Table 1 Reference levels in the study: the population-level values of the chosen population parameters (first row) and the mean squared errors (MSEs) for the local pivotal method with spatial coordinates (= geospatial spread) by increasing proportion of permanent clusters (p), with a sample size of n = 400. The unit of MSE is the squared unit of the variable
Quantity | Proportion of forested land (%) | Total volume (Mill. m^{3}) | Mean volume, coniferous: pine (m^{3}/ha) | Mean volume, coniferous: spruce (m^{3}/ha) | Mean volume, broadleaf (m^{3}/ha) | Mean volume, all tree species (m^{3}/ha) |
---|---|---|---|---|---|---|
Population | 74.8 | 1553 | 59.6 | 48.1 | 28.3 | 136.0 |
MSE, p = 0.1 | 1.28 | 1.20E+15 | 2.80 | 3.76 | 1.24 | 5.80 |
MSE, p = 0.2 | 1.29 | 1.16E+15 | 2.73 | 3.71 | 1.15 | 5.78 |
MSE, p = 0.3 | 1.35 | 1.18E+15 | 2.78 | 3.54 | 1.13 | 5.59 |
MSE, p = 0.4 | 1.35 | 1.19E+15 | 2.86 | 3.62 | 1.13 | 5.57 |
MSE, p = 0.5 | 1.30 | 1.14E+15 | 2.84 | 3.64 | 1.08 | 5.48 |
MSE, p = 0.6 | 1.34 | 1.13E+15 | 2.71 | 3.46 | 1.07 | 5.39 |
MSE, p = 0.7 | 1.32 | 1.14E+15 | 2.70 | 3.60 | 1.06 | 5.39 |
2.2 Auxiliary information
Auxiliary information in this study was from the tenth multi-source NFI (MS-NFI10) (Tomppo et al. 2008), which was available as georeferenced raster layers of 20 × 20 m pixel size. These forest resource maps were based on the field measurements (NFI10 in years 2003–2008) and Landsat 5 TM images from year 2007 (Tomppo et al. 2012).
Table 2 Thematic maps utilized in the study, cluster-level auxiliary variable description, and correlation between the auxiliary and primary data
Variable | Thematic map(s) | Description | Correlation^{4} |
---|---|---|---|
x_{1} | Mean volume of all tree species^{1} (m^{3}/ha) | Mean | 0.56 |
x_{2} | Mean pine volume^{1} (m^{3}/ha) | Mean | 0.54 |
x_{3} | Mean spruce volume^{1} (m^{3}/ha) | Mean | 0.64 |
x_{4} | Mean broadleaf volume^{1,2} (m^{3}/ha) | Mean | 0.49 |
x_{5} | Mean volume of all tree species^{1} (m^{3}/ha) | Variance | 0.37 |
x_{6} | Land class^{2} | Proportion of forested land | 0.67 |
2.3 LPM
In the selection process, the initial inclusion probabilities are gradually turned into inclusion indicators by an updating algorithm. While the indicator values change during the process, the actual inclusion probabilities remain at their initial level. The updating is carried out through pairwise comparisons. While the LPM algorithm is selecting a sample, the population is divided into two parts: the available and the decided population. In the beginning, the entire population is available, i.e., all inclusion indicators differ from 0 and 1. If the updated indicator value of a unit is 0, that unit will not be included in the sample, and it is moved from the available to the decided population. Similarly, a unit whose indicator reaches 1 is chosen to the sample and moved to the decided population. In every algorithm round, at least one unit is either chosen to the sample or loses its possibility to be included, and is thus moved to the decided population. Hence, while the LPM algorithm is running, the available population diminishes and the decided population, consisting of included and excluded units, grows. For more details of LPM, see, e.g., Grafström et al. (2012) and Fig. 2 in Räty et al. (2018).
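The pairwise updating described above can be illustrated with a minimal Python sketch of the mutual-nearest-neighbour variant of the method (LPM1) as described by Grafström et al. (2012). It is not the `lpm1` implementation of the BalancedSampling R package used in the study; the function name `lpm1_sketch`, the tolerance `eps`, and the toy population are assumptions made for illustration only.

```python
import numpy as np

def lpm1_sketch(prob, x, rng, eps=1e-9):
    """Illustrative local pivotal method (mutual nearest neighbours).

    prob : initial inclusion probabilities (ideally summing to an integer)
    x    : auxiliary variables, one row per unit (e.g. coordinates)
    Returns the indices of the selected units.
    """
    p = np.asarray(prob, dtype=float).copy()
    x = np.asarray(x, dtype=float).reshape(len(p), -1)

    def available():
        # units whose indicator is still strictly between 0 and 1
        return np.flatnonzero((p > eps) & (p < 1 - eps))

    def nearest(i, idx):
        # nearest available neighbour of unit i in the auxiliary space
        others = idx[idx != i]
        d = np.sum((x[others] - x[i]) ** 2, axis=1)
        return others[np.argmin(d)]

    idx = available()
    while idx.size > 1:
        i = rng.choice(idx)
        j = nearest(i, idx)
        if nearest(j, idx) != i:
            continue  # not mutual nearest neighbours: draw again
        pi, pj = p[i], p[j]
        if pi + pj < 1:
            # one unit is excluded, the other carries the summed value
            if rng.random() < pj / (pi + pj):
                p[i], p[j] = 0.0, pi + pj
            else:
                p[i], p[j] = pi + pj, 0.0
        else:
            # one unit is included, the other keeps the remainder
            if rng.random() < (1 - pj) / (2 - pi - pj):
                p[i], p[j] = 1.0, pi + pj - 1
            else:
                p[i], p[j] = pi + pj - 1, 1.0
        idx = available()

    if idx.size == 1:
        # a single undecided unit is resolved with its own probability
        k = idx[0]
        p[k] = 1.0 if rng.random() < p[k] else 0.0
    return np.flatnonzero(p > 0.5)

# Toy population (hypothetical): 40 units with two auxiliary variables
rng = np.random.default_rng(2018)
coords = rng.random((40, 2))
sample = lpm1_sketch(np.full(40, 0.25), coords, rng)
```

Each update decides at least one of the two units and preserves the expected value of both indicators, which is why the inclusion probabilities stay at their initial level. With equal probabilities of 0.25 for 40 units, the sketch returns a sample of fixed size.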
2.4 Stratified sampling
In stratified sampling, the population is divided into as homogenous strata as possible using the auxiliary variables. We used equal-distanced limits along the cumulative distribution of the square root of the density function of auxiliary variable to define the strata (see Cochran 1977; section 5A.7). The sample size within each stratum was defined with optimal allocation where the within-stratum variance of auxiliary variable was weighted with the size of the stratum (Cochran 1977). In this study, we utilized the stratifications that proved to be most efficient and robust in our previous study (Räty et al. 2018).
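As a rough illustration of the two steps above, the following Python sketch derives stratum limits with the cumulative square-root-of-frequency rule and then allocates the sample in proportion to stratum size times within-stratum standard deviation (textbook Neyman allocation, Cochran 1977). The histogram resolution `n_bins`, the function names, and the synthetic auxiliary variable are assumptions; the exact computations behind the stratifications in the study may differ.

```python
import numpy as np

def cum_sqrt_f_limits(x, n_strata, n_bins=100):
    """Stratum limits at equal steps of the cumulative sqrt of frequencies."""
    counts, edges = np.histogram(x, bins=n_bins)
    cum = np.cumsum(np.sqrt(counts))
    targets = cum[-1] * np.arange(1, n_strata) / n_strata
    cut = np.searchsorted(cum, targets)   # first bin reaching each target
    return edges[cut + 1]                 # upper edge of that bin

def neyman_allocation(x, limits, n):
    """Allocate n units proportionally to N_h * S_h within each stratum."""
    stratum = np.digitize(x, limits)      # 0 .. len(limits)
    n_strata = len(limits) + 1
    N_h = np.bincount(stratum, minlength=n_strata)
    S_h = np.array([x[stratum == h].std(ddof=1) if N_h[h] > 1 else 0.0
                    for h in range(n_strata)])
    w = N_h * S_h
    # rounding means the allocation may differ from n by a unit or two
    return stratum, np.round(n * w / w.sum()).astype(int)

# Synthetic auxiliary variable (hypothetical), same population size as here
rng = np.random.default_rng(0)
x_aux = rng.gamma(4.0, 30.0, size=5408)
limits = cum_sqrt_f_limits(x_aux, 4)
stratum, n_h = neyman_allocation(x_aux, limits, 400)
```

Because each stratum allocation is rounded separately, a real application would need a final adjustment so that the stratum sample sizes sum exactly to n.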
Table 3 Limits of strata in the stratifications based on cluster-level auxiliary variables (see Table 2 for definitions)
Name | Stratum/limits | |||||
---|---|---|---|---|---|---|
1 | 2 | 3 | 4 | 5 | 6 | |
Vol4 | x_{1} < 87.4 | 87.4 ≤ x_{1} < 120.8 | 120.8 ≤ x_{1} < 159.4 | x_{1} ≥ 159.4 | – | – |
Vol5 | x_{1} < 79.6 | 79.6 ≤ x_{1} < 107.7 | 107.7 ≤ x_{1} < 135.2 | 135.2 ≤ x_{1} < 169.8 | x_{1} ≥ 169.8 | – |
SPVol2 | x_{4} < x_{2} + x_{3} (conifer-dominated) | x_{4} ≥ x_{2} + x_{3} (others) | ||||
x_{1} < 89.1 | 89.1 ≤ x_{1} < 122.2 | 122.2 ≤ x_{1} < 160.5 | x_{1} ≥ 160.5 | x_{1} < 78.5 | x_{1} ≥ 78.5 | |
SPVol1 | x_{4} < x_{2} + x_{3} (conifer-dominated) | x_{4} ≥ x_{2} + x_{3} (others) | – | |||
x_{1} < 89.1 | 89.1 ≤ x_{1} < 122.2 | 122.2 ≤ x_{1} < 160.5 | x_{1} ≥ 160.5 | |||
x_{5} < 3952.2 | x_{5} ≥ 3952.2 | x_{5} < 6001.8 | x_{5} ≥ 6001.8 | |||
VolSpr | x_{1} < 99.2 | 99.2 ≤ x_{1} < 145.0 | x_{1} ≥ 145.0 | |||
x_{3} < 19.7 | x_{3} ≥ 19.7 | x_{3} < 42.7 | x_{3} ≥ 42.7 | x_{3} < 271.4 | x_{3} ≥ 271.4 | |
FL%Vol | x_{6} < 0.36 | 0.36 ≤ x_{6} < 0.64 | 0.64 ≤ x_{6} < 0.86 | 0.86 ≤ x_{6} ≤ 1.00 | ||
x_{1} < 121.0 | x_{1} ≥ 121.0 | x_{1} < 119.5 | x_{1} ≥ 119.5 | |||
FL%Pi | x_{6} < 0.36 | 0.36 ≤ x_{6} < 0.64 | 0.64 ≤ x_{6} < 0.86 | 0.86 ≤ x_{6} < 1.00 | ||
x_{3} < 53.1 | x_{3} ≥ 53.1 | x_{3} < 56.1 | x_{3} ≥ 56.1 | |||
Con5 | x_{2} + x_{3} < 59.3 | 59.3 ≤ x_{2} + x_{3} < 85.3 | 85.3 ≤ x_{2} + x_{3} < 110.6 | 110.6 ≤ x_{2} + x_{3} < 144.9 | x_{2} + x_{3} ≥ 144.9 | – |
Con6 | x_{2} + x_{3} < 53.3 | 53.3 ≤ x_{2} + x_{3} < 77.4 | 77.4 ≤ x_{2} + x_{3} < 97.3 | 97.3 ≤ x_{2} + x_{3} < 120.2 | 120.2 ≤ x_{2} + x_{3} < 152.2 | x_{2} + x_{3} ≥ 152.2 |
Con3BL2 | x_{2} + x_{3} < 77.4 | 77.4 ≤ x_{2} + x_{3} < 120.2 | x_{2} + x_{3} ≥ 120.2 | |||
x_{4} < 25.8 | x_{4} ≥ 25.8 | x_{4} < 25.3 | x_{4} ≥ 25.3 | x_{4} < 25.5 | x_{4} ≥ 25.5 | |
Con3FL%2 | x_{2} + x_{3} < 77.4 | 77.4 ≤ x_{2} + x_{3} < 120.2 | x_{2} + x_{3} ≥ 120.2 | |||
x_{6} < 0.61 | x_{6} ≥ 0.61 | x_{6} < 0.68 | x_{6} ≥ 0.68 | x_{6} < 0.59 | x_{6} ≥ 0.59 |
Pi3Spr2 | x_{2} < 40.6 | 40.6 ≤ x_{2} < 66.2 | x_{2} ≥ 66.2 | |||
x_{3} < 53.5 | x_{3} ≥ 53.5 | x_{3} < 51.7 | x_{3} ≥ 51.7 | x_{3} < 49.6 | x_{3} ≥ 49.6 | |
Spr3Pi2 | x_{3} < 32.4 | 32.4 ≤ x_{3} < 71.2 | x_{3} ≥ 71.2 | |||
x_{2} < 54.6 | x_{2} ≥ 54.6 | x_{2} < 54.3 | x_{2} ≥ 54.3 | x_{2} < 53.1 | x_{2} ≥ 53.1 |
Table 4 The stratifications performed, with the population sizes, N_{s}, and sample sizes, n_{s}, of the strata. Each stratum cell gives N_{s} / n_{s}. For a more detailed definition of auxiliary variables, see Table 3
Name | Stratifying variable(s) | Number of strata | Stratum 1 | Stratum 2 | Stratum 3 | Stratum 4 | Stratum 5 | Stratum 6 |
---|---|---|---|---|---|---|---|---|
Vol4 | Volume x_{1} | 4 | 1027 / 93 | 1888 / 103 | 1718 / 108 | 775 / 96 | – | – |
Vol5 | Volume x_{1} | 5 | 680 / 71 | 1458 / 82 | 1566 / 88 | 1192 / 83 | 512 / 76 | – |
SPVol2 | Species group dominance^{1}/volume x_{1} | 6 (2/4,2)^{2} | 990 / 83 | 1869 / 101 | 1660 / 104 | 740 / 92 | 98 / 11 | 51 / 9 |
SPVol1 | Species group dominance/volume x_{1} | 5 (2/4,1)^{3} | 990 / 80 | 1869 / 98 | 1660 / 100 | 740 / 89 | 149 / 33 | – |
VolSpr | Volume x_{1}/spruce volume x_{3} | 6 (3/2) | 978 / 95 | 671 / 34 | 1364 / 82 | 1125 / 67 | 746 / 52 | 524 / 70 |
FL%Vol | % forested x_{6}/volume x_{1} | 6 (4/1,1,2,2)^{4} | 813 / 121 | 1431 / 120 | 1183 / 39 | 825 / 40 | 733 / 40 | 423 / 40 |
FL%Pi | % forested x_{6}/pine volume x_{2} | 6 (4/1,1,2,2) | 881 / 115 | 1258 / 124 | 864 / 39 | 726 / 41 | 894 / 40 | 785 / 41 |
Con5 | Volume of conifers x_{2} + x_{3} | 5 | 754 / 72 | 1437 / 93 | 1577 / 102 | 1164 / 80 | 476 / 53 | – |
Con6 | Volume of conifers x_{2} + x_{3} | 6 | 558 / 56 | 1118 / 78 | 1323 / 88 | 1221 / 78 | 843 / 58 | 345 / 42 |
Con3BL2 | Volume of conifers x_{2} + x_{3}/volume of broadleaves x_{4} | 6 (3/2) | 1061 / 81 | 615 / 47 | 1569 / 92 | 975 / 62 | 691 / 72 | 497 / 46 |
Con3FL%2 | Volume of conifers x_{2} + x_{3}/% forested x_{6} | 6 (3/2) | 721 / 71 | 955 / 57 | 931 / 66 | 1613 / 99 | 532 / 58 | 656 / 49 |
Pi3Spr2 | Pine volume x_{2}/spruce volume x_{3} | 6 (3/2) | 842 / 62 | 840 / 80 | 1625 / 102 | 833 / 66 | 940 / 66 | 328 / 24 |
Spr3Pi2 | Spruce volume x_{3}/pine volume x_{2} | 6 (3/2) | 448 / 32 | 570 / 60 | 2375 / 157 | 1108 / 86 | 705 / 51 | 202 / 14 |
The result was a sample in which the permanent plots were spread as evenly as possible over the whole test area and the temporary plots as evenly as possible within each stratum. Standard stratified estimators were used to compute the estimates of the population parameters from each stratified sample.
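For reference, the textbook stratified estimator of a population mean weights each stratum sample mean by the stratum's share of the population. The Python sketch below shows this form only; the function name and the toy numbers are hypothetical, and the cluster-level estimators actually used for the forest parameters (for example, ratios such as mean volume per hectare) may be more involved.

```python
import numpy as np

def stratified_mean(y, stratum, N_h):
    """Stratified mean: sum over strata of (N_h / N) * sample mean in h.

    y       : sampled values
    stratum : stratum label of each sampled value
    N_h     : dict mapping stratum label to population stratum size
    """
    y = np.asarray(y, dtype=float)
    stratum = np.asarray(stratum)
    N = sum(N_h.values())
    return sum(N_h[h] / N * y[stratum == h].mean() for h in N_h)

# Toy example (hypothetical numbers): two strata of sizes 30 and 10
y_sample = [1.0, 3.0, 10.0]
labels = [0, 0, 1]
sizes = {0: 30, 1: 10}
est = stratified_mean(y_sample, labels, sizes)
```

In the toy example, the stratum means are 2 and 10 and the weights 0.75 and 0.25, so a stratum that is small in the population contributes little even if its sample mean is large.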
2.5 Design efficiency and sampling simulations
The hypotheses were tested in sampling simulations with the real NFI field clusters. In the simulations, the permanent and temporary NFI field clusters were put in one and the same population, i.e., the sampling population was N = 5408 clusters and sample size n = 400. Further, the clusters chosen for either set were not excluded from the selection of the other set. This corresponds to the situation where the two different cluster sets are spread independently from each other.
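The logic of such a simulation, repeated sample selection followed by a comparison of mean squared errors, can be sketched in Python as follows. The sketch assumes that RE is the ratio of the MSE of the reference design to the MSE of the candidate design, so that values above 1 favour the candidate. The synthetic population, the simple random sampling reference, and the systematic selection along the sorted auxiliary variable are stand-ins chosen for brevity, not the designs compared in this study.

```python
import numpy as np

def mse(estimates, truth):
    """Mean squared error of repeated estimates against the true value."""
    e = np.asarray(estimates, dtype=float)
    return float(np.mean((e - truth) ** 2))

def relative_efficiency(ref_estimates, design_estimates, truth):
    """RE > 1 means the candidate design beats the reference design."""
    return mse(ref_estimates, truth) / mse(design_estimates, truth)

# Synthetic population (hypothetical): y is correlated with auxiliary x
rng = np.random.default_rng(11)
N, n, reps = 4800, 400, 300
x = rng.uniform(0, 100, N)
y = x + rng.normal(0, 5, N)
truth = y.mean()

# Reference design: simple random sampling without replacement
ref = [y[rng.choice(N, n, replace=False)].mean() for _ in range(reps)]

# Candidate design: systematic selection along the sorted auxiliary
# variable with a random start (a simple way to use auxiliary data)
order = np.argsort(x)
k = N // n
cand = [y[order[rng.integers(k)::k]].mean() for _ in range(reps)]

re = relative_efficiency(ref, cand, truth)
```

Using the MSE rather than the variance means that any bias of a design counts against it, which matters when the candidate design is not exactly unbiased for the parameter of interest.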
All simulations, analyses, and visualizations were made with R (R Core Team 2018). The LPM was performed with the lpm1 function of the R package BalancedSampling (Grafström and Lisic 2018).
3 Results
Table 5 Relative efficiencies of sampling designs with the local pivotal method and stratified sampling when the proportion of permanent clusters out of n = 400 sample clusters is 0%, 10%, 30%, and 60% and the permanent clusters were distributed systematically. The result for p = 0 is from a previous study (Räty et al. 2018)
LPM design | Relative efficiency | Stratification | Relative efficiency | |||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Auxiliary variables | ||||||||||||||||||
All tree species, x_{1} | Pine, x_{2} | Spruce, x_{3} | Broadleaf, x_{4} | Variance, x_{5} | Forested land, x_{6} | Forested land | Total volume | Mean volume | Forested land | Total volume | Mean volume | |||||||
Pine | Spruce | Broadleaf | All tree species | Pine | Spruce | Broadleaf | All tree species | |||||||||||
p = 0 | ||||||||||||||||||
X | X | 2.15 | 1.77 | 1.03 | 1.21 | 1.00 | 1.48 | Con3BL2 | 2.77 | 1.22 | 0.71 | 0.75 | 0.76 | 0.9 | ||||
X | X | X | 2.07 | 1.66 | 1.19 | 1.55 | 0.99 | 1.45 | Con3FL%2 | 2.00 | 1.50 | 0.96 | 1.06 | 0.92 | 1.15 | |||
X | X | X | 2.07 | 1.59 | 1.49 | 1.56 | 1.05 | 1.36 | Con5 | 1.03 | 1.33 | 1.02 | 1.10 | 0.90 | 1.25 | |||
X | X | X | 2.04 | 1.67 | 1.14 | 1.16 | 1.26 | 1.40 | Con6 | 1.02 | 1.32 | 1.03 | 1.08 | 0.95 | 1.24 | |||
X | X | X | 2.02 | 1.64 | 1.40 | 1.43 | 1.10 | 1.37 | Vol5 | 0.96 | 1.24 | 1.03 | 1.12 | 1.03 | 1.22 | |||
X | X | X | 2.00 | 1.67 | 1.12 | 1.26 | 0.99 | 1.41 | SPVol2 | 0.95 | 1.23 | 0.87 | 1.07 | 0.85 | 1.21 | |||
X | X | X | X | 2.00 | 1.58 | 1.42 | 1.46 | 1.29 | 1.34 | FL%Vol | 0.94 | 1.27 | 0.93 | 1.11 | 0.85 | 1.24 | ||
X | X | X | X | X | 1.88 | 1.60 | 1.40 | 1.51 | 1.29 | 1.40 | Vol4 | 0.92 | 1.23 | 0.87 | 1.05 | 0.82 | 1.20 | |
X | X | X | X | X | X | 1.81 | 1.58 | 1.42 | 1.51 | 1.28 | 1.40 | Pi3Spr2 | 0.91 | 1.26 | 0.91 | 1.06 | 0.85 | 1.21 |
p = 0.1 | ||||||||||||||||||
X | X | 1.90 | 1.49 | 1.03 | 1.17 | 1.09 | 1.28 | Con3FL%2 | 1.90 | 1.68 | 1.10 | 1.26 | 1.29 | 1.40 | ||||
X | X | X | 1.89 | 1.48 | 1.32 | 1.35 | 1.14 | 1.25 | FL%Vol | 1.59 | 1.13 | 0.96 | 1.12 | 1.16 | 1.18 | |||
X | X | X | 1.89 | 1.51 | 1.06 | 1.25 | 1.06 | 1.29 | FL%Pi | 1.49 | 0.87 | 1.19 | 0.94 | 1.14 | 0.93 | |||
X | X | X | 1.81 | 1.48 | 1.06 | 1.19 | 1.25 | 1.29 | Con6 | 1.11 | 1.55 | 1.13 | 1.29 | 1.18 | 1.38 | |||
X | X | X | 1.79 | 1.47 | 1.31 | 1.42 | 1.14 | 1.31 | Spr3Pi2 | 1.05 | 1.28 | 1.29 | 1.44 | 1.20 | 1.27 | |||
X | X | X | 1.76 | 1.50 | 1.07 | 1.36 | 1.09 | 1.31 | Con5 | 1.03 | 1.51 | 1.11 | 1.26 | 1.22 | 1.44 | |||
X | X | X | X | X | 1.75 | 1.45 | 1.25 | 1.34 | 1.23 | 1.25 | Vol6 | 1.03 | 1.54 | 1.11 | 1.28 | 1.21 | 1.47 | |
X | X | X | X | 1.71 | 1.45 | 1.29 | 1.38 | 1.26 | 1.28 | Vol5 | 1.01 | 1.51 | 1.00 | 1.17 | 1.21 | 1.42 | ||
X | X | X | X | X | X | 1.71 | 1.46 | 1.29 | 1.34 | 1.20 | 1.30 | Pi3Spr2 | 0.97 | 1.32 | 1.38 | 1.39 | 1.30 | 1.30 |
p = 0.3 | ||||||||||||||||||
X | X | X | 1.59 | 1.34 | 1.08 | 1.25 | 0.99 | 1.19 | Con3FL%2 | 1.99 | 1.68 | 1.07 | 1.17 | 1.16 | 1.36 | |||
X | X | X | 1.59 | 1.37 | 1.19 | 1.21 | 1.06 | 1.23 | FL%Vol | 1.65 | 1.09 | 0.92 | 0.99 | 1.06 | 1.10 | |||
X | X | 1.58 | 1.37 | 1.05 | 1.10 | 1.15 | 1.22 | FL%Pi | 1.60 | 0.91 | 1.17 | 0.89 | 1.06 | 0.91 | ||||
X | X | X | 1.57 | 1.34 | 1.21 | 1.24 | 1.04 | 1.17 | Spr3Pi2 | 1.11 | 1.31 | 1.25 | 1.33 | 1.13 | 1.16 | |||
X | X | X | X | 1.56 | 1.33 | 1.20 | 1.23 | 1.11 | 1.17 | Con5 | 1.09 | 1.45 | 1.09 | 1.19 | 1.13 | 1.34 | ||
X | X | 1.56 | 1.31 | 0.99 | 1.06 | 1.03 | 1.16 | Vol6 | 1.08 | 1.47 | 1.07 | 1.14 | 1.18 | 1.42 | ||||
X | X | X | X | X | 1.52 | 1.33 | 1.17 | 1.21 | 1.12 | 1.21 | Con6 | 1.07 | 1.48 | 1.05 | 1.12 | 1.11 | 1.34 | |
X | X | X | 1.50 | 1.35 | 1.05 | 1.12 | 0.98 | 1.17 | Vol5 | 1.07 | 1.50 | 0.99 | 1.11 | 1.08 | 1.34 | |||
X | X | X | X | X | X | 1.50 | 1.34 | 1.17 | 1.22 | 1.13 | 1.17 | SPVol1 | 1.02 | 1.32 | 0.99 | 1.14 | 1.00 | 1.30 |
p = 0.6 | ||||||||||||||||||
X | X | X | 1.29 | 1.21 | 1.02 | 1.08 | 1.09 | 1.13 | FL%Vol | 2.21 | 1.22 | 0.88 | 0.89 | 0.90 | 1.09 | |||
X | X | X | 1.28 | 1.15 | 1.07 | 1.11 | 1.05 | 1.09 | FL%Pi | 2.15 | 1.05 | 1.18 | 0.76 | 0.93 | 0.87 | |||
X | X | 1.27 | 1.20 | 0.99 | 1.07 | 0.97 | 1.15 | Con3FL%2 | 1.91 | 1.54 | 1.01 | 1.14 | 1.06 | 1.25 | ||||
X | X | X | X | X | 1.25 | 1.16 | 1.05 | 1.16 | 1.09 | 1.10 | Con6 | 1.09 | 1.44 | 1.06 | 1.17 | 1.04 | 1.27 | |
X | X | X | X | X | X | 1.24 | 1.16 | 1.06 | 1.12 | 1.06 | 1.07 | Con5 | 1.05 | 1.36 | 1.04 | 1.10 | 1.05 | 1.26 |
X | X | X | 1.23 | 1.13 | 1.14 | 1.08 | 1.09 | 1.08 | SPVol1 | 1.04 | 1.27 | 0.94 | 1.06 | 0.96 | 1.24 | |||
X | X | X | 1.23 | 1.13 | 1.03 | 1.10 | 1.01 | 1.10 | Vol5 | 1.04 | 1.41 | 0.97 | 1.04 | 1.03 | 1.31 | |||
X | X | X | 1.23 | 1.13 | 1.00 | 1.08 | 1.02 | 1.09 | Spr3Pi2 | 1.03 | 1.21 | 1.22 | 1.31 | 1.04 | 1.13 | |||
X | X | X | X | 1.22 | 1.13 | 1.14 | 1.13 | 1.08 | 1.09 | Pi3Spr2 | 1.03 | 1.23 | 1.33 | 1.29 | 1.07 | 1.19 |
For the tree species–specific mean growing stock volumes, we observed three phenomena. First, the decreasing trend as a function of the increasing proportion of permanent clusters was not as obvious as for the other parameters. Second, for all tree species–specific mean growing stock volumes, the RE was larger if the auxiliary information included the tree species–specific variables. Third, the clear differences in the REs between the cases using different auxiliary variables, seen with small proportions of permanent clusters, vanished as the proportion of permanent clusters increased. In the end, the REs were the same regardless of the auxiliary variables included in the sample selection.
4 Discussion
The aim of this study was to estimate the efficiency of spatially balanced and stratified sampling designs in a realistic NFI situation. Spatially balanced sampling used LPM in sample selection, and it was applied in two different setups: In the first setup, the field clusters were divided into permanent clusters being surveyed in the consecutive inventories and temporary clusters, which were measured only once. The location of permanent clusters was fixed and arranged spatially systematically in consecutive inventories, but the temporary clusters were reallocated inside the study region each simulation round with LPM utilizing remote sensing data from previous inventory. In the second setup, the cluster groups were semi-permanent and temporary; thus, both cluster populations were reallocated each simulation round with LPM utilizing similar auxiliary remote sensing data. In the first setup above, also the stratified sampling method was assessed. In all simulations, the sample size was fixed but different proportions of samples were allocated into the two cluster populations.
We estimated the sampling efficiencies using the fixed positions and designs of sample clusters from the previous inventories with a total sampling intensity of 400/5408 ≈ 7.4%. A sampling design that is systematically placed should capture all the variation in the population, and the small sampling intensity guarantees that differences in design efficiency result from the actual performance of the methods. The efficiencies of the different sampling designs were studied with respect to the design where both cluster sub-populations were geospatially spread with LPM, which means that the samples in the sub-populations were close to the current systematic sampling design.
Our first hypothesis concerning the RE of LPM held. The RE of the sampling designs decreased as the proportion of permanent clusters in the sample increased in the sampling simulations (Fig. 3). In a previous study (Räty et al. 2018), where all the clusters were chosen with LPM from one population, the largest REs were 1.77 and 2.15 for the estimation of total growing stock volume and forested land proportion, respectively (Table 5). As the proportion of permanent clusters increased to 60% of the sample, the REs decreased by as much as 40% (Table 5). Nevertheless, as in the previous study (Räty et al. 2018), LPM produced similar results irrespective of the auxiliary variables chosen, whereas in stratification, the result depended heavily on the chosen stratification strategy (Figs. 3 and 4, Table 5). Even so, with LPM, the estimation of a given tree species–specific mean growing stock volume was enhanced if the growing stock volume of that species was included in the auxiliary information given to LPM (Table 5).
The proportion of permanent clusters in the Finnish NFI is 60% (Kangas et al. 2018). Based on this study, sampling with LPM would enhance the estimation with respect to the current systematic sampling design, but the expected improvements are smaller than previous studies (Grafström et al. 2017b; Räty et al. 2018) anticipated, being 5–25% for the different population parameters when the proportion of permanent plots in the sample is 60%. The question is whether the improvements gained at the design phase with LPM would contribute enough compared with other existing methods, such as post-stratification or model-assisted estimation (Haakana et al. accepted; Särndal et al. 1992; Kangas et al. 2016; Myllymäki et al. 2017), applied in the estimation phase to the results from a systematic sampling design. Possibly, the most efficient approach would then be a combination of cluster-level LPM and plot-level post-stratification.
Table 6 Relative efficiency of sampling designs when both temporary and permanent clusters were chosen with the local pivotal method utilizing auxiliary information
LPM design | Relative efficiency | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
Auxiliary variables | |||||||||||
All tree species, x_{1} | Pine, x_{2} | Spruce, x_{3} | Broadleaf, x_{4} | Variance, x_{5} | Forested land, x_{6} | Forested land | Total volume | Mean volume | |||
Pine | Spruce | Broadleaf | All tree species | ||||||||
p = 0.1 | |||||||||||
X | X | 2.11 | 1.85 | 1.08 | 1.39 | 1.27 | 1.57 | ||||
X | X | X | 2.08 | 1.86 | 1.56 | 1.60 | 1.42 | 1.57 | |||
X | X | X | 2.08 | 1.91 | 1.29 | 1.77 | 1.27 | 1.60 | |||
X | X | X | 2.06 | 1.83 | 1.16 | 1.45 | 1.21 | 1.53 | |||
X | X | X | 2.04 | 1.78 | 1.53 | 1.82 | 1.45 | 1.52 | |||
X | X | X | 2.00 | 1.82 | 1.17 | 1.37 | 1.60 | 1.53 | |||
X | X | X | X | X | 1.98 | 1.79 | 1.48 | 1.70 | 1.63 | 1.52 | |
X | X | X | X | 1.89 | 1.80 | 1.52 | 1.70 | 1.60 | 1.49 | ||
X | X | X | X | X | X | 1.83 | 1.76 | 1.49 | 1.74 | 1.61 | 1.57 |
p = 0.3 | |||||||||||
X | X | 2.19 | 1.83 | 1.04 | 1.29 | 1.12 | 1.49 | ||||
X | X | X | 2.11 | 1.81 | 1.19 | 1.54 | 1.15 | 1.51 | |||
X | X | X | 2.09 | 1.76 | 1.50 | 1.49 | 1.25 | 1.45 | |||
X | X | X | 2.08 | 1.86 | 1.17 | 1.32 | 1.51 | 1.52 | |||
X | X | X | 2.03 | 1.68 | 1.47 | 1.65 | 1.24 | 1.40 | |||
X | X | X | 1.98 | 1.70 | 1.13 | 1.33 | 1.14 | 1.38 | |||
X | X | X | X | 1.96 | 1.75 | 1.47 | 1.54 | 1.51 | 1.43 | ||
X | X | X | X | X | 1.93 | 1.73 | 1.47 | 1.60 | 1.42 | 1.47 | |
X | X | X | X | X | X | 1.81 | 1.66 | 1.46 | 1.59 | 1.44 | 1.44 |
p = 0.6 | |||||||||||
X | X | 2.09 | 1.70 | 1.02 | 1.22 | 1.10 | 1.39 | ||||
X | X | X | 2.07 | 1.62 | 1.45 | 1.48 | 1.18 | 1.38 | |||
X | X | X | 2.05 | 1.57 | 1.44 | 1.52 | 1.15 | 1.30 | |||
X | X | X | 2.04 | 1.73 | 1.19 | 1.54 | 1.15 | 1.49 | |||
X | X | X | 1.98 | 1.63 | 1.14 | 1.22 | 1.38 | 1.39 | |||
X | X | X | X | 1.96 | 1.59 | 1.35 | 1.53 | 1.39 | 1.35 | ||
X | X | X | 1.93 | 1.64 | 1.07 | 1.31 | 1.04 | 1.37 | |||
X | X | X | X | X | 1.93 | 1.67 | 1.40 | 1.55 | 1.39 | 1.40 | |
X | X | X | X | X | X | 1.80 | 1.60 | 1.36 | 1.54 | 1.31 | 1.40 |
Both datasets, MS-NFI10 and NFI11, had their own error sources and inaccuracies. For example, the location of the sample plots in NFI and MS-NFI has an inaccuracy of some meters, and therefore the given center coordinates might have fallen into an adjacent pixel in the MS-NFI auxiliary data raster with a pixel size of 20 × 20 m (Tomppo et al. 2012). On top of that, variable-radius sample plots (angle-gauge measurements) with a maximum radius of 12.52 m usually extend into the adjacent pixels (Korhonen et al. 2017). Therefore, we used not only the pixel in which the sample plot center falls but also the adjacent pixels when computing the auxiliary information. Thus, we could be more certain that we had included a pixel that describes the conditions of the sample plot, but at the same time, we may also have included pixels that describe the forest conditions of adjacent stands. These position errors may decrease the RE but do not cause bias.
The auxiliary variables were defined at the cluster level, which brought challenges to the simulation. Aggregating all auxiliary information from single pixels to cluster-level mean values smoothed out the extremes of the multi-dimensional distribution of the auxiliary variables. This was compensated for by adding an auxiliary variable that describes the amount of within-cluster variation, i.e., the variance of total growing stock volume. However, the variation could also originate from the distribution of land use or tree species composition, as well as from other site conditions such as altitude, which were not included as auxiliary variables in our study. Instead of mean values, other metrics describing the distances between the clusters in the multi-dimensional auxiliary space or the variation and distribution of the auxiliary variables within the clusters, as well as variance estimators (Grafström and Schelin 2014; Grafström et al. 2017a; Grafström and Matei 2018), could also be studied further.
5 Conclusion
Increasing the proportion of permanent sample plots had no effect on the RE of the stratified sampling designs, although their results depended on the variables chosen for stratification. In contrast, with spatially balanced sampling designs the REs decreased; the decrease was about 10% for the mean growing stock when the proportion of permanent plots in the sample increased to 60%. When the permanent plots were changed to semi-permanent plots, which were allocated in the same manner as temporary plots, i.e., based on auxiliary information rather than systematic sampling, the loss in RE experienced in spatially balanced designs disappeared. Therefore, the result invites consideration of sampling strategies with shorter-term permanent sample plots, which, however, might not be optimal for monitoring long-term changes, e.g., the effect of forest management on forest structure. On the other hand, further development of spatially balanced sampling methods could also solve the problem of how to take the permanent sample into account when selecting the temporary sample.
Notes
Acknowledgements
Open access funding provided by Natural Resources Institute Finland (LUKE).
Funding
This study was funded by the Ministry of Agriculture and Forestry of Finland key project “Puuta liikkeelle ja uusia tuotteita metsästä” (“Wood on the move and new products from forest”).
Compliance with ethical standards
Conflict of interest
The authors declare that they have no conflict of interest.
References
- Cochran WG (1977) Sampling techniques, 3rd edn. Wiley, New York, NY
- FAO (2012) Forest resources assessment 2015: Terms and Definitions. In: FAO Rep. http://www.fao.org/docrep/017/ap862e/ap862e00.pdf. Accessed 1 Feb 2019
- Grafström A, Lisic J (2018) Package “BalancedSampling” [online]. http://www.antongrafstrom.se/balancedsampling
- Grafström A, Lundström NLP (2013) Why well spread probability samples are balanced. Open J Stat 03:36–41. https://doi.org/10.4236/ojs.2013.31005
- Grafström A, Matei A (2018) Spatially balanced sampling of continuous populations. Scand J Stat. https://doi.org/10.1111/sjos.12322
- Grafström A, Ringvall AH (2013) Improving forest field inventories by using remote sensing data in novel sampling designs. Can J For Res 43:1015–1022. https://doi.org/10.1139/cjfr-2013-0123
- Grafström A, Schelin L (2014) How to select representative samples. Scand J Stat 41:277–290. https://doi.org/10.1111/sjos.12016
- Grafström A, Lundström NLP, Schelin L (2012) Spatially balanced sampling through the pivotal method. Biometrics 68:514–520. https://doi.org/10.1111/j.1541-0420.2011.01699.x
- Grafström A, Saarela S, Ene LT (2014) Efficient sampling strategies for forest inventories by spreading the sample in auxiliary space. Can J For Res 44:1156–1164. https://doi.org/10.1139/cjfr-2014-0202
- Grafström A, Schnell S, Saarela S et al (2017a) The continuous population approach to forest inventories and use of information in the design. Environmetrics 28:e2480. https://doi.org/10.1002/env.2480
- Grafström A, Zhao X, Nylander M, Petersson H (2017b) A new sampling strategy for forest inventories applied to the temporary clusters of the Swedish NFI. Can J For Res 47:1161–1167. https://doi.org/10.1139/cjfr-2017-0095
- Haakana H, Heikkinen J, Katila M, Kangas A (2019) Efficiency of post-stratification for a large-scale forest inventory – case Finnish NFI. Ann For Sci 76:9. https://doi.org/10.1007/s13595-018-0795-6
- Kangas A, Myllymäki M, Gobakken T, Naesset E (2016) Model-assisted forest inventory with parametric, semiparametric, and nonparametric models. Can J For Res 46:855–868. https://doi.org/10.1139/cjfr-2015-0504
- Kangas A, Astrup R, Breidenbach J et al (2018) Remote sensing and forest inventories in Nordic countries – roadmap for the future. Scand J For Res 33:394–412. https://doi.org/10.1080/02827581.2017.1416666
- Köhl M, Scott CT, Zingg A (1995) Evaluation of permanent sample surveys for growth and yield studies: a Swiss example. For Ecol Manag 71(3):187–194
- Köhl M, Scott CT, Lister AJ et al (2015) Avoiding treatment bias of REDD+ monitoring by sampling with partial replacement. Carbon Balance Manag 10(11):1–11. https://doi.org/10.1186/s13021-015-0020-y
- Korhonen KT, Ihalainen A, Ahola A et al (2017) Suomen metsät 2009–2013 ja niiden kehitys 1921–2013 [online]. Luonnonvara- ja biotalouden tutkimus 59/2017. Luonnonvarakeskus, Helsinki, p 86
- Lloyd C (2009) Spatial data analysis - an introduction for GIS users. Oxford University Press, Oxford
- Matis KG, Hetherington JC, Kassab JY (1984) Sampling with partial replacement - a literature review. Commonw For Rev 63:193–206
- Myllymäki M, Gobakken T, Naesset E, Kangas A (2017) The efficiency of poststratification compared with model-assisted estimation. Can J For Res 47:515–526. https://doi.org/10.1139/cjfr-2016-0383
- Päivinen R, Yli-Kojola H (1989) Permanent sample plots in large-area forest inventory. Silva Fenn 23:243–252
- Patterson HD (1950) Sampling on successive occasions with partial replacement of units. J R Stat Soc Ser B Methodol 12(2):241–255
- R Core Team (2018) The R Project for Statistical Computing. https://www.r-project.org/. Accessed 14 Dec 2018
- Räty M, Heikkinen J, Kangas AS (2018) Assessment of sampling strategies utilizing auxiliary information in large-scale forest inventory. Can J For Res 48:749–757. https://doi.org/10.1139/cjfr-2017-0414
- Särndal C-E, Swensson B, Wretman J (1992) Model assisted survey sampling. Springer-Verlag Publishing, New York, NY
- Scott CT (1998) Sampling methods for estimating change in forest resources. Ecol Appl 8:228–233. https://doi.org/10.1890/1051-0761(1998)008[0228:SMFECI]2.0.CO;2
- Scott CT, Köhl M (1994) Sampling with partial replacement and stratification. For Sci 40:30–46. https://doi.org/10.1093/forestscience/40.1.30
- Tomppo E, Haakana M, Katila M, Peräsaari J (2008) Multi-source national forest inventory - methods and applications. In: Series: Managing Forest Ecosystems 18. Springer, Berlin
- Tomppo E, Gschwantner T, Lawrence M, McRoberts RE (eds) (2010) National forest inventories: pathways for common reporting. Springer, Berlin
- Tomppo E, Heikkinen J, Henttonen HM et al (2011) Designing and conducting a forest inventory - case: 9th National Forest Inventory of Finland. Springer, Netherlands
- Tomppo E, Katila M, Mäkisara K, Peräsaari J (2012) The Multi-source National Forest Inventory of Finland – methods and results 2007 [online]. Work Pap Finnish For Res Inst 233. http://www.metla.fi/julkaisut/workingpapers/2012/mwp227.pdf
- Tomppo E, Malimbwi R, Katila M et al (2014) A sampling design for a large area forest inventory: case Tanzania. Can J For Res 44:931–948. https://doi.org/10.1139/cjfr-2013-0490
- Vidal C, Alberdi IA, Hernández Mateo L, Redmond JJ (eds) (2016) National forest inventories - assessment of wood availability and use, 1st edn. Springer International Publishing, Cham
Copyright information
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.