Efficiency of post-stratification for a large-scale forest inventory—case Finnish NFI

Post-stratification based on remotely sensed data is an efficient method in estimating regional-level results in the operational National Forest Inventory. It also enables calculating the results accurately for smaller areas than with the default method of using the field plots only. The utilization of auxiliary information in survey sampling through model-assisted estimation or post-stratification has gained popularity in forest inventory recently. However, post-stratification at a large scale involves practical concerns such as the availability of auxiliary data independent of the sample at hand, and a large number of variables for which the results are needed. We assessed the efficiency of two different types of post-stratification, either post-stratifying for each variable of interest separately or using one post-stratification for all variables, compared to the estimation based on the field sample plots only. In addition, we examined the precision of area and volume estimates, and the efficiency of post-stratification at different spatial scales. For post-stratification, we used the volume maps based on Landsat satellite imagery, digital map data, and the sample plot data of the previous inventory. The efficiencies of post-stratifications based on the mean volume and the mean volumes by tree species were compared. In estimating the total volume, the relative efficiency of post-stratification compared to field plot based estimation was 1.54–3.54 over the provinces in South Finland. In estimating the volumes by tree species groups, the relative efficiency was 0.93–2.39. The gain with a separate stratification compared to the stratification based on total mean volume for all variables was at largest 0.69. In the small test areas, the relative standard errors of the total volume estimates decreased on average by 33% by using post-stratification instead of sample plots only. The mean relative efficiency was 2.36. The utilization of an old forest resources map and post-stratification based on the mean volume is an operational approach for the National Forest Inventory. Post-stratification also enables calculating the results accurately for markedly smaller areas than with the field plots only. Post-stratification reduced the probability of very high sampling variances, making the results more robust.


Introduction
The utilization of remotely sensed data as auxiliary information in forest inventory can markedly improve the precision of the estimates. One option for utilizing the auxiliary information is the model-assisted (MA) framework (Särndal et al. 1992). In MA estimation, while the model is used for improving the precision, the inference on the precision of the results is solely based on the design of the sample. Even if the model is incorrect, the results are not biased. Another option is the model-based (MB) approach, where the model is not only used for improving the precision, but the whole inference depends on it (Cassel et al. 1977). In the case of MB, an incorrect model will yield biased results. MB approach is often used for small-area estimation, but the problems in assessing the potential bias hinder the use of this method (Magnussen et al. 2016).
Another option for the utilization of auxiliary data is the poststratification (PS), where the sample is stratified after the data is collected (Holt and Smith 1979). PS can be considered to be a special case of MA, where a model with categorical explanatory variable(s) is utilized. Similar models could also be used in MB approach. All of these methods have gained popularity in forest inventory in recent years (e.g., McRoberts et al. 2002;Gregoire et al. 2011;Ståhl et al. 2011;McRoberts et al. 2012;Dahlke et al. 2013;Tipton et al. 2013;Saarela et al. 2015aSaarela et al. , 2015b. In many cases, PS is based on classifying the predictions of a model with continuous explanatory variables. MAwith continuous independent variables is more efficient than MA with a low number of classes (i.e., PS), but with a reasonable number of strata (4-8) the two approaches are very close to each other with respect to precision (Myllymäki et al. 2017). This is reasonable as Myllymäki et al. (2017) carried out PS using the predictions from the same linear model that was also used in MA with continuous variables. With a high enough number of strata, PS can seem even more efficient than MA, but a large number of strata easily introduce other problems such as empty strata. McRoberts et al. (2017) used a k nearest neighbor (knn) model in a comparison, and in their case study, MA with continuous variables produced 10-14% lower standard errors than PS for all considered numbers of strata (4-8). In PS, an empty stratum can be dealt with aggregating it into a larger stratum, which results in a less efficient PS than would have been expected without. However, if post-strata are based on a model, Cochran (1977, p. 133) indicates that there is little reduction in variance to be expected when more than six strata are used.
While the gains from either MA or PS are evident, the utilization of auxiliary information in a large-scale forest inventory is not necessarily straightforward in practice. One reason for this is that the number of variables of interest in forest inventory is usually very large (over 100 in the Finnish National Forest Inventory (NFI)), and modeling each variable of interest separately may be impractical (Opsomer et al. 2007). As the gains are dependent on the correlation between the variable of interest and the auxiliary variables, receiving full gains from the auxiliary information requires selecting the best combination of auxiliary variables separately for each variable of interest. It is possible to use PS based on a model estimated for one variable, say total volume or biomass, for all variables. However, that will introduce losses in efficiency. In MA approach, the use of one generic model form for all variables would be a feasible option but would also introduce losses in efficiency related to modeling each variable independently.
Another concern is the availability of an external model for MA or PS. If the model used in PS is constructed from the sample at hand (i.e., internal), it is called endogenous poststratification (Breidt and Opsomer 2008). Magnussen et al. (2015) and Kangas et al. (2016) showed in a simulation study that such an approach may lead to a serious underestimation of variances. On the other hand, a model available from a previous inventory can serve as an external model (Kangas et al. 2016;Myllymäki et al. 2017). Thus, for utilizing an external model in the current inventory, independent information derived from remote sensing or other auxiliary data sources at the time of the previous inventory need to be available, for estimating the external model and predicting a stratifying variable.
If it is not possible to get high-quality data as wall-to-wall auxiliary data, it is possible to utilize two-phase or three-phase approach. For instance, acquisition costs for wall-to-wall airborne laser scanning (ALS) data may be prohibitive, but strip data may be available. Then, the ALS strips can serve as a first phase sample, and MA or PS can be applied within the strips (Saarela et al. 2015b;Ene et al. 2018). In an operational NFI, wall-to-wall data is preferable for practical reasons, as results are calculated for areas with a high variety in size.
A third possibility would be to utilize the auxiliary information already at the design phase, in stratification (e.g., Katila and Tomppo 2002;Tomppo et al. 2014) or using a type of balanced sampling, e.g., a local pivotal method (Grafström et al. 2017;Räty et al. 2018). Katila and Tomppo (2002) used pre-stratification in the estimation of inventory results for small areas. The knn estimation was employed by map strata, that is, all field plots within each map stratum were used for estimating the areas of land use classes and forest variables of the particular stratum. While pre-and post-stratification are closely related methods, pre-stratification may lead to unsatisfactory results for those variables that are not among the stratification criteria (Räty et al. 2018), and therefore, poststratification is more robust for a multi-purpose inventory such as NFI.
In some countries (e.g., Brazil, France, Netherland, Slovakia, Sweden, USA), auxiliary information is applied in NFI to improve efficiency (Tomppo et al. 2010). According to a questionnaire survey conducted by Barrett et al. (2016), such countries were 11% of the 45 respondents. Yet, in the Swedish NFI, official statistics are based solely on field plots, but small-area statistics are calculated by using design-based PS, where knn maps or other map products are used for stratification (Fridman et al. 2014). In many other countries like Finland and Norway, the methodology is being developed, but the methods have not been introduced to operational use (McRoberts et al. 2012(McRoberts et al. , 2014Kangas et al. 2018).
In the current Finnish NFI, all national and regional level results (i.e., for provinces) are calculated based solely on field plots. Results for smaller areas, such as municipalities, that do not contain a sufficient number of plots for precise estimates have been calculated applying the multi-source NFI (MS-NFI) method which utilizes satellite images, digital map data and NFI field data (Tomppo et al. 2008). The estimates are weighted averages or sums of the field plot variables. The weights for the field plots are defined employing the knn estimation method (Tomppo et al. 2008). However, the resulting estimates are then purely model-based, and field plots outside the target area are also used. This may introduce bias which is not estimable (see, however, Magnussen et al. 2014Magnussen et al. , 2016. Using PS in small-area estimation would enable design-based inference, rather than model-based inference (i.e., relying solely on the plots within the area for inference).
The aim of the current study was to assess the efficiency of PS estimation, compared to the field inventory, that is, estimation based on the sample plots only. First, we considered two different types of post-stratifications, either based on the predictions of the variable of interest, or the predictions for just one of the variables. We calculated the losses in efficiency due to not making separate post-stratification for all variables of interest. Secondly, we analyzed the effect of spatial scale on the efficiency of PS, given the current sampling design with a fixed number of clusters and sample plots per unit area. The aim was to explore what is the smallest size of a target area for which the NFI results could be calculated with an acceptable accuracy by using PS. In examining the results, the applicability of PS in the operational NFI was considered.

Study areas and field data
We used the sample plot data from the 11th NFI (NFI11) in Finland measured in 2009-2013. The study area covered 15 provinces in South Finland and included two sampling density regions (or geographical strata), where sampling designs were slightly different (Figs. 1 and 2) (Korhonen et al. 2017). The total area was 208,674 km 2 of which 73% on land dominated by forests (Table 1). The most common tree species were Scots pine (Pinus sylvestris) and Norway spruce (Picea abies (L.) Karst.), and birches (Betula spp.) and other deciduous trees (mainly Populus tremula L. and Alnus spp.) had a lower proportion. The number of NFI11 sample plots in the study area was 63,706, of which 46,914 were on land.
To estimate the effect of the size of an area on the precision of the results, we selected three test areas from both NFI sampling regions (Fig. 3). The areas were placed and delineated in a way that the number of clusters in each area was the same in both regions. The number of plots per clusters was fixed within each region but varied between the regions. The test areas were then divided into smaller areas so that the relation of the size and the number of clusters within the area remained constant (Fig. 2). The sizes of the smallest areas were 1/16 of the largest, and they had 16 times less NFI clusters than the largest areas (Table 2). Spruce accounted for the highest percentage of volume in the test areas in sampling region 1, whereas it was pine in sampling region 2 (Table 3).

Field measurements
The NFI11 field plots in the study areas were restricted angle count sample plots with a basal area factor (relascope factor) of 2.0 and a maximum radius of 12.52 m. All trees on forested land were measured for their diameter at breast height (dbh, 1.3 m), and every seventh tree was measured as a sample tree in more detail, for example, for height, age, and increment over the past 5 years (Korhonen et al. 2017). The stem volume of each sample tree was estimated by using volume functions (Laasasenaho 1982) based on dbh, tree height, and also the diameter at 6 m height if the tree was taller than 8 m. For tally trees, the form height (stem volume per basal area) was estimated as a weighted mean of the form heights of the most similar sample trees (Korhonen et al. 2017). For the estimation of total and mean volumes by regions, tree stem volumes on sample plots were converted to volumes per hectare represented by each tree (Korhonen et al. 2017).

Auxiliary information
MS-NFI maps based on the sample plot data of the 10th NFI of Finland (NFI10) from the years 2005-2008 and Landsat 5 TM images from 2007 (Tomppo et al. 2012) were used as auxiliary data. The full-coverage raster maps with 20 m pixels were produced by combining satellite images, digital map data, and the NFI sample plot data, and by using the knn method for estimating the volume of the growing stock (m 3 / ha) and the volumes by tree species groups (pine, spruce, birch, and other deciduous tree species) for each image pixel on the forestry land mask (Tomppo and Halme 2004;Tomppo et al. 2008). In the estimation, digital map data from the National Land Survey of Finland (NLS) was used to separate forestry land from other land use classes, such as agricultural land, roads, built-up areas, and waterbodies, i.e., to create a forestry land mask. Note, however, that the MS-NFI volume estimates were calculated for the pixels on the forestry land mask using all the sample plots classified as forestry land in the field independently of the map data value.
In post-stratification, we used the MS-NFI volume maps: the mean volume of the whole growing stock and the mean volumes of pine, spruce, birch, and other deciduous trees; and combinations of the other land-use classes from the NLS map data, in estimating results for the provinces. The same stratification based on the total mean volume was used in estimating the efficiency of PS in the areas of different sizes. Even though some of the plots were permanent, post-stratification was independent of NFI11 data in the sense that it did not use any NFI11 measurements. The dependence resulting from re-measured permanent plots does not introduce similar underestimation of variance as endogenous post-stratification (Kangas et al. 2016).

Post-stratification
For a single variable y, the best characteristic for the construction of strata would be the distribution of y itself, or another variable x highly correlated with it (Cochran 1977, p. 127). When remotely sensed data are used as auxiliary information, the number of potential explanatory variables is usually very high. Then, PS can be based, for instance, on the predictionsŷ from a (linear or non-linear) model using some explanatory variables x (e.g., Magnussen et al. 2015).
Four volume strata were constructed with boundaries y i , i = 0,…,4, determined so that y 0 is the minimum and y 4 the maximum of predicted volumesŷ, and the integrals ∫ , are all equal, where f is the probability density of the predictions (Dalenius and Hodges Jr 1959;Cochran 1977, p. 127, Sect. 5A.7). This approach produces an optimal (Neyman) allocation of plots to the used strata. Other options are available, for instance dividing the range of predictions or the cumulative distribution to equal size intervals (Myllymäki et al. 2017). In addition, because the forestry land mask is not reliable enough, other land use classes were stratified to three strata: agricultural land, built-up area, and waterbodies. In the MS-NFI volume maps 2007, there was a small area covered by clouds on the west coast (Fig. 1). The pixels under the cloud were stratified to the waterbodies as the sea covered most

Etelä-Savo
Etelä-Karjala Etelä-Pohjanmaa of the area. The selection of four volume strata was based on previous experiences that little gain in precision was achieved with more than four to five Landsat-based strata (McRoberts et al. 2002;Nilsson et al. 2003;Katila 2006;McRoberts 2010;Magnussen et al. 2015).

Post-stratified estimators and their sampling variance
Forest areas were estimated bŷ where A h is the area of stratum h based on the MS-NFI volume map, and n h is the total number of NFI plots and n F,h the number of forest plots within that stratum, and total volumes byV where u h = ∑ k ∈ h w k fh k is the weighted sum of estimated form heights (volume per basal area) of trees k tallied in the plots of stratum h. In genuine relascope plots, weights w k would all be equal (i.e., basal area factor), but in plots with a maximum radius, large trees must be up-weighted; for further details, see Tomppo et al. (2011, Sect. 3.2.1). The stratum-specific estimators of area proportionsâ h ¼ n F;h =n h (in Eq. 1) and mean volumesv h ¼ u h =n F;h (in Eq. 2) are similar to ordinary NFI estimators, and their sampling variances were estimated as in the operational NFI, taking into account the clustering of sample plots and systematic sampling (Tomppo et al. 2011, Sect. 3.5.1). To be more precise, let us consider the generic form of (stratum-specific) NFI estimators, where y c,h and x c,h are cluster-level plot counts or sums. For area proportionsâ h , y c,h is the number of forest plots and x c,h is the total number of plots in cluster c that belong to stratum h. For mean volumesv h , y c,h is the weighted sum u h restricted to cluster c and x c,h is the number of those forest plots of cluster c that belong to stratum h. The sampling variance ofμ h was estimated by where z c;h ¼ y c;h −μ h x c;h is a cluster-level residual, g is a group of clusters close to each other, and the weights b c are determined so that each local quadratic form (∑ c ∈ g b c z c ) 2 is an unbiased estimator of the variance of z c 's (Matérn 1960, p. 110). By analogy to Eq. 3, the mutual covariances of estimatorŝ μ h of area proportions or mean volumes were estimated by where x c,h denotes the count or sum x c restricted to plots of stratum h. Finally, the sampling variances of post-stratified and the delta method was applied to approximate the variance of post-stratified estimators of mean volumes (Tomppo et al. 2011, Sect. 3.5.2): The relative efficiencies (RE) of post-stratified estimatorŝ Y PS of forested area, total volumes (m 3 ), and volumes by tree species groups (m 3 ) compared to their ordinary NFI estimatorŝ Y N FI based on the field data only were calculated as ratios of the =Ŷ was used as an absolute measure of sampling uncertainty. The estimation results are presented for Bforested land,^which consists of productive forest land (growth at least 1 m 3 /ha/a) and poorly productive forest land (growth at least 0.1 m 3 /ha/a) according to the national landuse classification and used in the NFI reporting. These national classes cover the global class of forest and, to a large extent, also other wooded land as defined by the Food and Agriculture Organization of the United Nations (FAO 2012).

Post-stratification
A total of 58.9% of the sample plots in South Finland were in the volume strata (1-4) ( Table 4) and the rest, 41.1%, in the strata of other land use classes (5-7). The percentages of plots in agricultural land, built-up land, and waterbodies were 10.4%, 4.5%, and 26.2%, respectively, and the same in all stratifications.
In the test areas of different sizes, the distribution of sample plots to the strata differed between the sampling regions. The area of other land use and waterbodies, and consequently, the percentage of sample plots in the strata 5-7 were higher in Southern Finland (38.1%) than in Central Finland (29.2%). Moreover, the test areas differed in a way that in Southern Finland, the plot distribution to the volume strata was more even. The percentage of plots in the stratum of highest volume (202 m 3 /ha and above) was 13.1%, whereas in Central Finland, only 6.3%.

Comparison of field estimation and post-stratification at province level
The use of PS based on the mean volume improved the precision of area and volume estimates clearly, compared to the NFI estimates based on the field inventory, i.e., sample plots only ( Table 5). The mean relative efficiency in estimating forested land area was 3.26 in South Finland (Table 6) and at highest in the provinces of Uusimaa and Varsinais-Suomi, 4.64 and 4.47, respectively. In estimating the total volume, the mean relative efficiency of PS was 2.32 and at highest 3.54 in the province of Keski-Pohjanmaa (Fig. 4, Table 6).
Also for the volumes by trees species groups, PS based on the total mean volume was more efficient than the field inventory, and slightly more if the mean volume of the tree species in question was used in the stratification (Table 6). PS based on the pine volume increased the efficiency at most by 0.58 in the province of Pohjois-Karjala, compared to the PS based on the total mean volume (Fig. 5). The gain in the relative efficiency was on average 0.26 (Table 6). For spruce volume, the gain in efficiency by using PS based on the spruce volume compared to PS based on the mean volume was largest in the province of Keski-Pohjanmaa, 0.69, and 0.28 on average (Fig. 6, Table 6). In estimating the total volume of birch, PS was slightly more efficient than the field inventory (1.14) in the whole South Finland, and PS based on the birch volume further improved the efficiency (1.34) (Fig.  4, Table 6). In estimating the volume of other deciduous tree species, PS and the field inventory were practically equally efficient in all provinces, and PS based on the mean volume of other deciduous trees hardly improved the efficiency (Fig. 4, Table 6).

Effect of spatial scale on the precision
In the test areas of different sizes, PS markedly improved the precision of area and volume estimates, compared to the field inventory. The mean relative standard error (SE) of the estimate for forested area decreased most in the smallest test areas, but the decrease was relatively highest in the largest test areas (Tables 7 and 8). For example, in the smallest areas (576 km 2 ) in sampling region 1, PS resulted in the mean SE 43% smaller than the mean SE with the field inventory, that is, the mean relative SE decreased from 7.81 to 4.43% (Table 7). In the largest areas (9216 km 2 ), the mean SE was decreased by 50% by using PS. In estimating the total volume, PS improved the precision by 28% in the smallest and 35% in the largest areas (Fig. 7, Table 9). In the sampling region 2, the improvements were 29% and 37%, respectively (Fig. 8, Table 10). With PS, the maximum values of SE were clearly lower than with the field inventory ( Figs. 7 and 8). In general, PS was more efficient in all test areas. The mean REs for the forested area were in the range 3.12-3.86 (Tables 7 and 8) and for the total volume 2.09-2.52 (Tables 9 and 10). However, in some of the smallest test areas, the minimum RE values were below one, meaning that the field inventory occasionally produced more precise results than PS.

Discussion
The most important result from this study was that even though the mean volume was used for stratification, the results for the provinces were fairly good with all tested variables. In particular, it is important that no deterioration of efficiency was observed: Even if the improvement in RE was modest with some variables, the precisions still improved. The only exception was the volume of other deciduous trees, for which RE was below one in some of the provinces (Fig. 4). In an operational NFI, it is very important to get good estimates for all the variables in question. In connection with this study, we also tested PS for a large set of other variables, such as forested area and volumes by development classes (not reported here), and no significant deterioration of results was observed with any of the variables. One important variable, volume of dead wood was not included in our tests, but according to Nilsson et al. (2003Nilsson et al. ( , 2005, PS improved also the precision of dead-wood volume estimates compared to those based on NFI field data alone. The relative efficiencies obtained in this study are consistent with the results of the previous studies on PS based on satellite imagery in forest inventories (Nilsson et al. 2003;Nilsson et al. 2005;McRoberts et al. 2006). The range of 1.54-3.53 in estimating the total volume over the provinces (Table 6) is similar to RE of 3.7 reported by Nilsson et al. (2003) and 1.1-1.7 by McRoberts et al. (2006). The lower REs in the North East states of USA were attributed to greater heterogeneity in forests there, characterized by naturally generated, mixed species and uneven-aged stands, than in Scandinavian countries (McRoberts et al. 2006). Nilsson et al. (2005) also concluded that stratification based on the stem volume of individual tree species did not improve the estimation precision of the variable in question, except the volume of deciduous trees. In estimating forested land area, the efficiency of PS was highest in the provinces of Uusimaa and Varsinais-Suomi, where the percentage of other land use area is high (Table 6). In estimating the pine and spruce volumes, PS based on the mean volume of the species in question further improved the efficiency only by 0.27 on average (Table 6). In the case of pine volume, the highest gains were obtained in the provinces of Varsinais-Suomi, Pohjois-Karjala, and Etelä-Pohjanmaa (Fig. 5), where the percentage of pine of the total volume was also high. In the case of spruce, the gain was highest in the small province of Keski-Pohjanmaa (Fig. 6), which was, however, not related to the dominance of spruce in the area (only 16% of the total volume). One reason for the low improvement of efficiency by using PS based on the mean volumes by tree species is the poor precision of knn predictions at the pixel level, about 60-80% for the mean volume and even over 100% for volumes by tree species (Katila and Tomppo 2001). In cross-validation tests by tree species groups, the proportion of the variation in the field plots explained by the knn predictions was largest for the spruce volume, and for the volumes of deciduous tree species, the explanatory power was poor (Katila and Tomppo 2001). These findings seem to reflect to the gains in the precision using PS in this study. However, stratification at the pixel level may lead to the assigning of erroneous strata to the sample plots and confusing the estimation with PS based on the volumes by tree species. One solution to overcome the problem of large pixel-level variance in knn predictions is segmentation of the volume maps or original Landsat images into homogenous units, and assigning each segment a stratum based on the mean volume within the segment in question (Nilsson et al. 2003(Nilsson et al. , 2005. Also other postprocessing methods, such as edge preserving smoothing, could be used for reducing within stand variation on MS-NFI volume maps (Tomppo 1996). In the operational NFI, segmentation or filtering would bring an additional processing phase, but a possible improvement in efficiency due to better compatibility of strata and plot variables is worth exploring.
In Finland, the distribution of forest land and heterogeneity of forests differ from south to north due to climatic conditions. The current NFI sampling is adapted to this large-scale spatial variability and designed to produce reliable results for regional administrative units, sampling density regions following unit borders (Tomppo et al. 2011). One option would be to stratify the sampling regions separately, which was also tested in connection with this study. However, the same PS for the whole of South Finland was applied because it resulted in efficiencies very close to those obtained with the separate PSs for the sampling regions 1 and 2. The separate PS improved the relative efficiency most, by 0.12 compared to PS for South Finland, in estimating the total volume in the province of Varsinais-Suomi. This indicates that the regions also differ from west to east, or from coast to inland. PS of a smaller area could improve results in some divergent regions, but applying the same PS for the whole South Finland is more operational.
The efficiency of PS, for example, the mean value of 2.32 in estimating total volume (Table 6), can be interpreted so that the number of sample plots could be decreased by 2.32 to achieve the same precision as by the field inventory. If the precision of NFI results with the current sampling size is regarded to be sufficient, expensive field measurements could be reduced by using PS in calculating forest statistics for national and regional levels. On the other hand, as the efficiency for some variables, like the volume of other deciduous species did not improve, decreasing the number of plots would reduce the precision for those estimates. Thus, significant cost savings may not be achieved if the precision of these variables needs to be at the current level.
With PS, the current sampling can be used to estimate results for smaller areas than with the field inventory. The goal in developing the Finnish NFI is to provide forest information for smaller areas than provinces. The results for the test areas of different sizes showed that with PS, the total volume could be estimated with the relative SE lower than 5% for an area of 2304 km 2 in Southern Finland and 1568 km 2 in Central Finland, whereas with the field inventory for areas double the sizes of these. For the volumes by tree species groups, the relative SEs would be higher, for example, in the area of 2304 km 2 in Southern Finland, and the average SEs were 7.9% for pine and 7.4% for spruce. It is notable that the small area estimates obtained with PS were robust, compared to the field plot based estimates, as the largest variances improved even more than the average variances.
PS with current field sampling is, however, not an adequate method to calculate results for all municipalities in South Finland, because their mean size is only 884 km 2 , ranging from the smallest urban municipality of 6 km 2 to extensive rural municipalities at most of 5548 km 2 . Using PS, estimates of forest variables can be, however, obtained for large municipalities, combinations of municipalities, or different areas of interest other than administrative units. Thus, in the smallest municipalities, the model-based inference would still be needed. In a previous study, Katila (2006) reported that with MS-NFI, that is, by means of knn, mean volume can be estimated  with a precision of 5% for areas of 100 km 2 and larger. The problem with knn is, however, the analytical inference.
Estimators have been developed (e.g., Baffetta et al. 2009;Magnussen 2013), and in the future, these new estimators need to be tested in the operational NFI setting and compared to the well-established methods such as post-stratification to address their potential. The improvements observed using PS were fairly good, given that the MS-NFI volume maps based on Landsat imagery several years older than the NFI11 data were used. The main reason for using the forest volume maps from an old inventory was to use independent auxiliary data and thus avoid the endogenous post-stratification. Compared to up-todate satellite images or laser scanning data, our approach is likely to produce a little less accurate results, i.e., the old data potentially reduces the efficiency. This is because between the two inventory rounds, large numbers of operations have been carried out in the forests, which reduce the correlation between the old MS-NFI map information and the current plot information. It is possible, however, to improve the results by updating the changes like clear-cut areas in the maps used for stratification. This can be done by using satellite images from  (Törmä et al. 2011). The changes due to growth and thinning would still be missing. Another, more laborious alternative would be to update the old NFI plots for the changes and use them and new RS material to produce an independent, but more up-todate stratification. The possible improvements of such approaches remain to be studied later.
On the other hand, by using the old, published MS-NFI volume maps, we avoid many other problems in more recent satellite images, such as missing data due to clouds. In the published MS-NFI 2007 maps, there was only a small area of missing data due to clouds, which was mainly water. The missing data may have slightly affected the results of the Pohjanmaa province. In MS-NFI, the relatively good coverage can be obtained because the maps are based on satellite images from one or more years. Generally, cloud gaps are patched with older images or images with poorer quality. In addition, satellite images have between-image differences, which affect the MA and PS results. In the MS-NFI forest resources maps, these differences have been somewhat reduced by the used knn approach, where plots only within the image area are utilized as reference data (Tomppo et al. 2008(Tomppo et al. , 2012: While the images differ, the estimated forest characteristics have the properties of the underlying forests. In our case, the off-the-shelf forest resource map makes PS fully operational from the start. MS-NFI maps are produced every other year for the whole country except Åland islands and Northern Lapland, which are covered less frequently (Metla 2013;Mäkisara et al. 2016). The Finnish NFI in turn is a continuous inventory where one round is completed in every 5 years. The  availability of MS-NFI maps would not be a problem, while PS would be relying on MS-NFI maps based on the field data of the previous NFI round.
In MA approaches, it is very important that the area of forest land is accurately known to avoid biased estimates. The erroneous forest area will introduce bias if, e.g., the model predicts high volumes or biomass to areas that is not forest at all, for instance, on waterbodies. In PS, the strata sizes are needed for weighting the strata, and incorrect weights result in less accurate estimates but not biased. Although bias can be avoided, it is important to have accurate map data also in PS to obtain precise results that are comparable to the official statistics, for example, on land use areas. In the NLS map data, areas of waterbodies are regarded as being accurate, whereas areas of other land uses are underestimated (Tomppo et al. 2008). In MS-NFI, i.e., in estimating forest statistics for municipalities by means of knn, the plot weights are corrected with a calibration method to reduce the errors due to confusion between forestry land and other land use classes in the field data and on the map (Katila et al. 2000). Also the accuracy of PS results could be improved by adjusting stratum areas according to calibrated municipality areas.

Conclusion
The main conclusion is that the utilization of an old forest resources map is a fully operational approach for nationallevel post-stratification. The efficiency improvements were surprisingly good, given that the correlations between Landsat images and forest resources are modest, and the map was several years old. PS clearly improved the precision of the forest estimates at the province level, compared to the estimates based on field plots only, for example, on average by 33% in the case of total volume. PS also enabled calculating the results accurately for markedly smaller areas than with the field plots. It is also very important that the post-stratified estimates for small areas were robust, meaning that the possibility of very poor estimates was clearly lower than with the estimates based on field plots only. The interest of using PS in small area estimation, for example, at the municipality level in Finland, is the design-based inference. This would be an advantage compared to the currently used model-based approach, MS-NFI (knn), which has a possibility of bias and lacks an analytical error estimator for areas of different sizes. The results of this study confirmed that with the current NFI sampling, PS could be used to estimate forest statistics accurately for large municipalities and combinations of municipalities.
In PS approach, it is critical that the strata are accurately delineated and the sample plots are assigned to correct strata. The precision of forest estimates could be further improved by using updated image material, for example, by updating regeneration cuttings to improve the correlation between the MS-NFI map data and current plot data. Because of the high errors of knn predictions at the pixel level, the precision of PS estimates could be potentially improved by using segmentation, i.e., post-stratifying homogenous segments derived from knn volume maps or original Landsat images. Moreover, the accuracy of PS results could be improved by adjusting stratum areas according to calibrated municipality areas, i.e., corrected for map errors in land use classes.  Contribution of the co-authors Haakana compiled the data, conducted the analyses, and wrote most of the manuscript. Heikkinen developed the estimators, consulted on the calculations, and wrote the methods concerning the estimators. Katila provided tools for post-stratification and consulted on the post-stratification. Kangas coordinated the research, drafted the manuscript, and wrote the introduction part. Heikkinen, Katila, and Kangas commented and revised the manuscript.
Funding The study was conducted as a part of the NFI2020 project funded by the Finnish Ministry of Agriculture and Forestry.
Statement of data availability The datasets generated and analyzed during the current study are available from the corresponding author on reasonable request. Restrictions apply to the availability of NFI sample plot data because they contain sensitive locational information, and the map data which was used under license from the National Land Survey of Finland.

Compliance with ethical standards
Conflict of interest The authors declare that they have no conflict of interest.