Human impact on forests in early twentieth century Finland

In northern Europe, changes in forest ecosystem structures are commonly attributed to the ubiquitous impact of modern forestry. However, the starting point for modern forestry was not a pristine forest, but landscapes influenced for centuries by diverse human activities. Our aims were to (1) describe spatial patterns of forest structure and species composition over large scales in 1920s Finland, and (2) analyze how these characteristics were influenced by human population and past land uses. We mapped ca. 3000 systematic sample plots measured in the first Finnish National Forest Inventory (1921–1924) and produced a series of maps of large-scale variation in forest characteristics in upland forests. We analyzed forest age and size structures, and species compositions relative to human population and land-use data. We found strong geographical and regional gradients in forest age and size structures, and tree species composition. Depending on the variable, these characteristics were at the stand level best explained by human population density, reflecting the long history of various forest uses. Tree species composition was clearly associated with site productivity, but also with the history of slash-and-burn agriculture and forest grazing. Forest landscapes in early twentieth century Finland exhibited a strong human fingerprint, visible as the abundance of young forests in populated areas, while in remote areas forest characteristics typical of natural forests prevailed. These gradients in human impact a century ago are still reflected as legacies in forest structure, a situation that needs consideration in management and restoration.


Introduction
In northern Europe, forest management practices associated with industrial-scale forestry have led to a major simplification of the structure and composition of forested landscapes during the past decades (Kuuluvainen 2009). These changes include homogenization of forest structures and reduction or disappearance of ecologically important structural elements, such as big and old trees and abundant dead wood, that are typical characteristics of natural forests (Lindenmayer et al. 2012; Gauthier et al. 2015). However, when evaluating the ecological impacts of different forest management approaches, it is important to realize that for centuries prior to industrial-scale forestry, most forested landscapes were already influenced by a diverse array of often local human activities (Tasanen 2004). Hence, local or regional knowledge is needed to understand what their ecological consequences were, and still are. This knowledge is also crucial in guiding forest ecosystem management and restoration, where the broader landscape context influences the effectiveness of local restoration measures (Tscharntke et al. 2012; Nordén et al. 2013).
Industrial-scale forestry tends to homogenize forests by creating similar structures everywhere, such as even-aged, spatially uniform and compositionally simplified stands. However, behind this "homogeneity mask" created by industrial-scale forestry, the historical development and characteristics of the forest ecosystems preceding the transition to industrial forests may vary substantially (Svensson et al. 2022). These variations include especially natural and human-caused disturbance history and associated structural legacies. In northern Europe, the natural structure and development of forests can be understood via the interaction between site type conditions, fire occurrence and tree species traits (Keeley and Pausas 2022), which jointly determine how forests develop and how they respond to natural and human disturbances (Rogers et al. 2015; Berglund and Kuuluvainen 2021).
In northern European boreal forests, low-fertility xeric stands have historically been characterized by relatively frequent but low-intensity fires. These are sites where fire-tolerant Scots pine dominates, and often survives fires. As fire at the same time promotes regeneration, these stands and landscapes typically show age cohort structures (Aakala 2018). At the other end of the site-fertility spectrum, mesic sites undergo classic tree-species successions, in which fires are stand-replacing and deciduous pioneer species dominate early post-fire stands. These sites burn rarely, and in the continued absence of fire, the fire-intolerant and relatively shade-tolerant Norway spruce gradually attains dominance. Stands are typically maintained by gap- and patch-scale dynamics (Berglund and Kuuluvainen 2021), which promote Norway spruce dominance with a mixture of deciduous trees that are able to regenerate in larger openings. Norway spruce shapes microclimates within the stands, and the succession in ground vegetation leads to a low probability of ignitions (Lindberg et al. 2021). At the same time, these sites accumulate biomass and become characterized by horizontally and vertically continuous fuels. This increases the probability of a high-intensity burn if the forest is ignited (Kuuluvainen and Aakala 2011). In the intermediate sites between the xeric and the mesic sites, either Scots pine or Norway spruce may dominate. In these sites, species dominance depends on fires: frequent fires favor Scots pine, which in turn increases the probability of further fires. Absence of fire leads to Norway spruce gradually taking over, decreasing the probability of future fires.
As a result of this type of stand-scale dynamics, landscape-scale tree species composition and tree age structure are expressed as a mosaic of different site types with their specific fire dynamics. As fires in mesic sites are rare, stands tend to be old, whereas on xeric sites trees often survive the fires and attain old age. Old trees are thus a ubiquitous feature of the natural Fennoscandian landscapes regardless of site type variation (Pennanen 2002).
Fennoscandian boreal forests in Norway, Sweden and Finland have probably been under more intensive and long-lasting human influence than anywhere else in the boreal zone (Östlund et al. 1997; Kouki et al. 2001). During the past centuries, forests have been utilized in varying intensities and in ways that varied among regions. Generally, the southern parts of Fennoscandia were converted to fields and pastures already in medieval times, while in the northern parts of Fennoscandia, forest structures have been more significantly modified only in the more recent past (Kuuluvainen et al. 2017). The changes in forest structure at stand, landscape and regional scales were created by various forest uses of increasing extent and intensity over time, and with increasing population density (Esseen et al. 1997). Because many changes in forest structure are cumulative and many structures recover slowly (Svensson et al. 2022), even a low-intensity forest use has the potential to lead to considerable changes in forest structure if it continues for long periods of time.
Historical patterns of forest utilization in Finland showed considerable regional differences. Areas around population centers in the southern part of the country were converted early to fields and pastures to feed the rapidly growing human population. In eastern Finland almost all productive forest land was used in slash-and-burn cultivation (Heikinheimo 1915). In coastal areas and along main watercourses that facilitated timber floating, shipbuilding, tar production and wood exports consumed large amounts of wood (Myllyntaus and Mattila 2002; Mälkönen and Redko 2005; Moore 2010), as did potash production (Kunnas 2007), and especially wood consumption for heating. In connection with timber extraction and slash-and-burn agriculture, extensive woodland grazing led to a decrease in timber resources by hampering forest regeneration (Östlund et al. 1997; Myllyntaus and Mattila 2002). This preindustrial use of forests led to local scarcity of timber resources in certain parts of Finland (von Berg 1995, but see also Hölttä 2013).
With the expansion of the sawmill industry from the mid-nineteenth century onwards, forest harvesting became characterized by more extensive high-grading of the best quality sawn timber from easily accessible and transportable areas. The intensity and extent of this practice increased with the expansion of the forest industry, including successive waves of high-grading (Keto-Tokoi and Kuuluvainen 2014). With the emergence and expansion of the paper and pulp industry since the late nineteenth century, smaller trees also became valuable and were harvested (Myllyntaus and Mattila 2002). This unfolding process of industrial exploitation of increasing intensity, together with other domestic uses of the forests, eventually led to overutilization and local depletion of forest resources (Keto-Tokoi and Kuuluvainen 2014). One important reason for the depletion was that forest utilization paid scant attention to forest regeneration. For example, the commonly practiced woodland grazing efficiently prevented forest regeneration, especially where grazing was conducted in slash-and-burn areas (Heikinheimo 1915).
As a result of the long history of various forest uses, the volume of wood in Finnish forests reached a low point in the beginning of the twentieth century (Myllyntaus and Mattila 2002). Previous studies have since shown how the onset of industrialized forest management has led to increases in the growing stock of forests, and in the number of large trees (Henttonen et al. 2019, 2020), starting from the 1920s (Tasanen 2004). However, because of the lack of infrastructure, and major differences in population density and livelihoods, it is likely that forest characteristics showed variability at different scales, and stark contrasts in different parts of the country.
In this paper, we present a spatially explicit snapshot analysis of the state of the Finnish forests in the 1920s, when the first national forest inventory was carried out. Our aim was to analyze the presence of old trees in the forest landscapes, the complexity of forest structures, and species composition prior to modern forestry. To reach these aims, we mapped the approximate locations of ca. 3000 sample plots of the first inventory and described geographic variation in several different variables. We aim to answer the question of how forests in the 1920s deviated from natural forests, and how these deviations were linked to humans and their livelihoods.

Study area
Our study area was the entire mainland region of Finland in the 1920s (Fig. 1). Except for the very southernmost areas of hemiboreal forest and the treeless areas of the northernmost Finland, the area belongs to the boreal zone. The bedrock in the area is made up of Precambrian granites and gneisses and covered by young Quaternary and Holocene sediments consisting mainly of podzolized moraines. The forested area exhibits relatively modest variation in topography, although the northern parts are characterized by gently rolling mountains (fells) with treeless summits.
In the area covered by the boreal forest, the mean temperature of the warmest month (July) ranges from approximately 18 °C in the south to 13 °C in the north (all climate averages reported here are for the period 1981-2010). The mean temperature of the coldest month (February) varies from −5 °C in the south to −13 °C in northern Finland. Precipitation generally varies from 700-750 mm in the southwest to 450-500 mm in the northeast.

The first National Forest Inventory 1921-1924
The first National Forest Inventory in Finland (NFI1) was carried out in 1921-1924. The sampling design has been described in detail by Ilvessalo (1927), but in short, the inventory was based on diagonal inventory lines (Fig. 1). Three types of data were collected: (1) stand (compartment) data, using line-intersect sampling; (2) systematic sample plot data, using rectangular sample plots of 10 m × 50 m located along the lines 2 km apart; (3) plots for "valuable trees" (trees ≥ 30 cm diameter at 1.3 m height, i.e., the diameter at breast height, DBH) that were 10 m × 50 m extensions of the regular plots (i.e., 10 m × 100 m in total). In each stand, data were collected by visually estimating forest characteristics, such as volume and tree species proportions. The systematic sample plots were then used to calibrate the visual estimates, and the forest resources were assessed based on the calibrated visual estimates.
In this study, we used the systematic sample plot data of trees measured on the 10 m × 50 m sample plots. We deemed this data to be the most suitable of the three data types as it was based on measurements instead of visual estimation. In the extended plots for valuable trees, different inventory crews apparently had different criteria for which trees were considered valuable and hence we did not include these data here. On mineral soils, this data set consisted of 3074 plots in total.
The systematic sample plot data included measurements of tree diameters and species, age information for a sample tree in each DBH class (2 cm classes, i.e., 0-2 cm, 2-4 cm, …), as well as the identification of the stand in which the sample plot was located. This stand data contained a number of additional variables, of which we used the information on site type (based on the Finnish site type classification), location along the inventory line, and information on forest grazing that was commonly practiced in many parts of the country (Henttonen et al. 2020). In preliminary analyses we saw discrepancies in the minimum size of the trees included in the measurement data: plots on some of the inventory lines contained data only on trees with diameter class ≥ 3 cm DBH (3 cm being the midpoint of the class containing 2-4 cm DBH trees), and some lines only on trees with diameter class ≥ 5 cm DBH. Hence, in all analyses, we used only trees belonging to the diameter class 5 cm DBH and up.
Fig. 1 a … (Witting 1928), treeless areas (shown in white), and the approximate locations of the sample plots visible as (diagonal) lines from the Southwest to the Northeast, b state-owned land in 1925 (Witting 1928), and c the commonness or frequency of slash-and-burn agriculture in Finnish parishes in 1913 (based on Heikinheimo (1915))
As noted by Henttonen et al. (2019) in their analysis of the same dataset, the data included cored tree ages in many of the stands (trees separated into 20-year age classes), but detailed documentation about the sampling procedure was lacking. Similarly, not all stands had this information, and the age data are probably not suitable for providing statistical estimates. Nevertheless, we augmented the plot-level age data with ages recorded in the stand data and included the ages in our analyses, as age is one of the most salient characteristics from the point of view of human influence on forests, and because old trees are a particular biodiversity concern (Piovesan et al. 2022). Our interpretation is that these data represent a minimum estimate for the age of the stand.

Determining plot locations and location uncertainty
The original maps detailing the inventory line locations have been lost (personal communication with the Finnish National Archives), but we used the published approximate locations of the inventory lines to locate the inventory plots (Ilvessalo 1927). We then used the data on the location of the stands along the inventory line, which were given as distance from the starting point. Locating the starting point for different parts of the line was rather complicated, and the details on the location procedure are given in supplementary material (Online Resource 1).
Once the stands were located, we tested the accuracy and validity of our locating procedure. For this, we took advantage of the fact that in the inventory data railroads were assigned their own land-use class, meaning that the inventory crews recorded each time they crossed a railroad. We digitized the map of railroads in 1925 (Witting 1928), and calculated the distance between the railroad crossings located from the inventory data and the true railroad locations digitized from the railroad map. We then assumed that the shortest distance between the inventory record and the railroad on the map was indicative of the magnitude of the error in the location procedure. Of the assigned railroad compartments, 90% were located less than 8 km from the nearest railroad, which we considered a sufficient location accuracy for this study (Online Resource 1). Finally, the sample plots were given the locations of their corresponding stands. If a compartment had multiple sample plots (as was the case especially in the North with large compartments), the sample plots were spaced 2 km apart.
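The railroad-based accuracy check can be illustrated with a minimal sketch. It assumes planar coordinates in kilometres and represents each digitized railroad as a polyline of vertices; the function names and the toy data are hypothetical and are not taken from the original analysis.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest planar distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:  # degenerate segment: a and b coincide
        return math.hypot(px - ax, py - ay)
    # Position of the perpendicular foot along the segment, clamped to [0, 1].
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def distance_to_railroads(p, railroads):
    """Distance from p to the nearest railroad; railroads is a list of polylines."""
    return min(
        point_segment_distance(p, line[i], line[i + 1])
        for line in railroads
        for i in range(len(line) - 1)
    )


def share_within(crossings, railroads, threshold_km=8.0):
    """Share of recorded railroad crossings closer than threshold_km to a mapped railroad."""
    distances = [distance_to_railroads(p, railroads) for p in crossings]
    return sum(d < threshold_km for d in distances) / len(distances)


# Toy example (hypothetical coordinates, km): one straight railroad, four crossings.
railroads = [[(0.0, 0.0), (100.0, 0.0)]]
crossings = [(10.0, 3.0), (50.0, -7.0), (90.0, 12.0), (120.0, 0.0)]
share = share_within(crossings, railroads)
```

In this toy case two of the four crossings fall within 8 km of the railroad. In the study itself the corresponding share was reported as 90%.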

Measures of forest structure and species composition
For the analyses we selected the following dependent variables that could be extracted or computed from the sample plot data (using the minimum 5 cm DBH size class, i.e., trees > 4 cm DBH): (1) tree species composition (basal area proportions of Scots pine (Pinus sylvestris), Norway spruce (Picea abies), birch (Betula pendula and Betula pubescens) and other deciduous tree species), (2) tree species richness (number of species), (3) tree species diversity index (using the Shannon diversity index), (4) the presence of large trees (≥ 40 cm DBH), (5) the age class of the oldest tree, and (6) the number of tree size classes present in each plot. For the last of these, we reduced the number of classes to five (4-10, 10-20, 20-30, 30-40, and 40+ cm DBH), so that the number of size classes at each plot varied between 0 and 5. The main species in the group "other deciduous tree species" include trembling aspen (Populus tremula), alders (Alnus incana and Alnus glutinosa), and a variety of species that occur as individuals in mixed forests, such as rowan (Sorbus aucuparia) and goat willow (Salix caprea).
We visualized the large-scale patterns in these variables with smoothed maps, using inverse-distance weighted interpolation (0.5 weight, 100 nearest points considered). To show the smaller-scale variability that is masked by the smoothing procedure, we used a leave-one-out approach: for each plot and each variable, we first computed the smoothed raster without that sample plot, and then calculated the difference between the value in the plot data and the smoothed value (Online Resource 2). Visualizations were done with the gstat package in R (Pebesma and Graeler 2022). We emphasize here that the smoothed values were used only for visualizing the larger-scale variability. In the statistical analyses (below), we used the plot data, not the smoothed data.
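As an illustration of how such plot-level variables can be derived from a tree list, the following sketch computes basal area proportions, species richness, the Shannon index, the number of size classes, and the presence of large trees for a single plot. It is a sketch under stated assumptions: the Shannon index is computed here from basal area proportions with natural logarithms (the text does not state which basis was used), and the function and variable names are hypothetical.

```python
import math
from bisect import bisect_right

# Lower bounds (cm DBH) of the five size classes 4-10, 10-20, 20-30, 30-40 and 40+.
SIZE_CLASS_LOWER_BOUNDS_CM = [4, 10, 20, 30, 40]
LARGE_TREE_DBH_CM = 40


def basal_area_m2(dbh_cm):
    """Cross-sectional stem area at breast height, in square metres."""
    return math.pi * (dbh_cm / 200.0) ** 2


def plot_metrics(trees):
    """Compute structure and composition measures for one plot.

    trees: list of (species, dbh_cm) tuples; trees below 4 cm DBH are ignored.
    """
    trees = [(sp, d) for sp, d in trees if d >= SIZE_CLASS_LOWER_BOUNDS_CM[0]]
    basal = {}
    for sp, d in trees:
        basal[sp] = basal.get(sp, 0.0) + basal_area_m2(d)
    total = sum(basal.values())
    proportions = {sp: ba / total for sp, ba in basal.items()} if total else {}
    # Assumption: Shannon index H = -sum(p * ln p) over basal area proportions.
    shannon = -sum(p * math.log(p) for p in proportions.values())
    # Index of the size class each tree falls into (0 = 4-10 cm, ..., 4 = 40+ cm).
    classes = {bisect_right(SIZE_CLASS_LOWER_BOUNDS_CM, d) - 1 for _, d in trees}
    return {
        "proportions": proportions,
        "richness": len(proportions),
        "shannon": shannon,
        "n_size_classes": len(classes),
        "has_large_tree": any(d >= LARGE_TREE_DBH_CM for _, d in trees),
    }


# Toy plot (hypothetical data): DBH values are 2 cm class midpoints.
example = plot_metrics([("pine", 31), ("spruce", 31), ("birch", 3)])
```

In the toy plot the 3 cm birch is dropped by the minimum-size rule, leaving two species with equal basal area in a single size class.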
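The smoothing and leave-one-out steps can be sketched in a few lines. This is a simplified illustration, not the gstat workflow used in the study: it assumes that the "0.5 weight" refers to the inverse-distance exponent, it predicts directly at the plot location rather than reading a value from a raster cell, and all names are hypothetical.

```python
import math


def idw_predict(target, points, values, power=0.5, n_nearest=100):
    """Inverse-distance weighted mean of the n_nearest observations around target."""
    ranked = sorted(
        (math.hypot(target[0] - x, target[1] - y), v)
        for (x, y), v in zip(points, values)
    )
    nearest = ranked[:n_nearest]
    if nearest[0][0] == 0.0:  # target coincides with an observation
        return nearest[0][1]
    weights = [d ** -power for d, _ in nearest]
    return sum(w * v for w, (_, v) in zip(weights, nearest)) / sum(weights)


def leave_one_out_residuals(points, values, power=0.5, n_nearest=100):
    """Observed value minus the value smoothed from all other plots, per plot."""
    residuals = []
    for i in range(len(points)):
        other_points = points[:i] + points[i + 1:]
        other_values = values[:i] + values[i + 1:]
        smoothed = idw_predict(points[i], other_points, other_values, power, n_nearest)
        residuals.append(values[i] - smoothed)
    return residuals


# Toy example (hypothetical data): three plots on a line, the middle one deviates.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
values = [1.0, 5.0, 3.0]
residuals = leave_one_out_residuals(points, values)
```

A large positive or negative residual marks a plot that departs from its smoothed surroundings, which is the fine-scale variability the smoothed maps hide.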

Population density and land-use patterns
As a measure of human impact, we used spatially explicit data on population density, slash-and-burn agriculture, and forest grazing. For population density, depicting the long-term human pressure on forest resources, especially for fuel and construction wood, we obtained the data from the Atlas of Finland (Witting 1928). The population density was digitized (using ArcGIS 10.3) as points, each point representing 100 persons at the point location. To use these point data as predictors, we divided the country into hexagonal grid cells with 10 km hexagon width and summed the point-based population within each grid cell. The gridded data was then used to create a raster by inverse-distance weighted smoothing (with a distance exponent of 1; Geostatistical Analyst toolbox in ArcGIS 10.3). Each sample plot in the NFI1 data was assigned the raster value as an approximation of the relative population density around the plot location. Population density is highest in the South and lowest in the North, and thus is correlated with annual temperatures. This means that the North-South variation in population density also encompasses much of any potential temperature effect, and the two are statistically difficult to tease apart. We consider the interpretation of the population density variable in the Discussion, because whether it plausibly reflects a population density effect, a temperature effect, or a mixture of both depends on the variable analyzed.
Data on slash-and-burn agriculture was obtained from the maps compiled by Heikinheimo (1915; Fig. 1), which show a classification of how common slash-and-burn agriculture was in Finnish parishes in 1860 and 1913. We first digitized the parish maps, assigned each parish a slash-and-burn class, and then assigned each NFI1 sample plot a slash-and-burn class of the parish it was located in. Since these two predictors (slash-and-burn in 1860 and 1913) were correlated, we only used the more recent 1913 data in our analyses. Forest grazing has been an important determinant for forest structure (Henttonen et al. 2020), and it was recorded for each compartment in the NFI1 data, and we used it as an additional predictor. Finally, we separated state-owned forests from private forests to assess whether forest structures were related to ownership. This information was obtained by digitizing a map of state-owned forests in 1925 (Fig. 1b), and assigning each sample plot a value based on whether it fell inside these areas or not.
We expected the human influence on forest structure to differ according to site type, and the species composition to vary naturally with it. We thus included site type (barren, xeric, sub-xeric, mesic, herb-rich; sensu Cajander 1926), which was recorded in the stand data, as an additional predictor in all models.

Determinants of species composition and oldest tree ages
We explored the role of human impact as a determinant of forest structure and species composition, using boosted regression trees, following the approach detailed by Elith et al. (2008). Boosted regression trees combine two different algorithms: regression trees (models that relate a response to their predictors by recursive binary splits) and boosting, in which many simple models are combined to improve predictive performance of the final model.
As decision trees in general, regression trees have the advantage that predictor variables can be of any type, and model outcomes are unaffected by different scales of measurement among predictors (Elith et al. 2008). We considered this flexibility advantageous as our predictor variables were obtained from a varied collection of data compiled into maps. Although this data is of varying quality, we consider their information content suitable for the purposes of this study, i.e., a descriptive analysis of relationships between these variables over large scales.
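To make the combination of regression trees and boosting concrete, the following toy sketch fits a sequence of single-split trees ("stumps") to the residuals of the current model, each on a random subsample, and adds them with a small learning rate. It is a deliberately minimal illustration for one numeric predictor and squared-error loss; it is not the gbm implementation used in the study, and the function names and parameter values are hypothetical.

```python
import random


def fit_stump(xs, residuals):
    """Find the single binary split of x that minimises squared error.

    Returns (threshold, left_mean, right_mean); x <= threshold goes left.
    """
    n = len(xs)
    order = sorted(range(n), key=lambda i: xs[i])
    total = sum(residuals)
    best_gain, best_stump = None, None
    left_sum = 0.0
    for rank in range(1, n):
        prev, cur = order[rank - 1], order[rank]
        left_sum += residuals[prev]
        if xs[prev] == xs[cur]:
            continue  # cannot split between identical predictor values
        left_mean = left_sum / rank
        right_mean = (total - left_sum) / (n - rank)
        # Maximising this quantity is equivalent to minimising squared error.
        gain = rank * left_mean ** 2 + (n - rank) * right_mean ** 2
        if best_gain is None or gain > best_gain:
            best_gain = gain
            best_stump = ((xs[prev] + xs[cur]) / 2, left_mean, right_mean)
    if best_stump is None:  # all predictor values identical: no split possible
        mean = total / n
        return (float("inf"), mean, mean)
    return best_stump


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def fit_boosted_stumps(xs, ys, n_trees=200, learning_rate=0.1, bag_fraction=0.5, seed=0):
    """Stochastic gradient boosting with stumps under squared-error loss."""
    rng = random.Random(seed)
    n = len(xs)
    intercept = sum(ys) / n
    fitted = [intercept] * n
    n_bag = min(n, max(2, round(bag_fraction * n)))
    stumps = []
    for _ in range(n_trees):
        bag = rng.sample(range(n), n_bag)  # random subsample for this iteration
        stump = fit_stump([xs[i] for i in bag], [ys[i] - fitted[i] for i in bag])
        stumps.append(stump)
        # Shrink each tree's contribution by the learning rate.
        fitted = [f + learning_rate * predict_stump(stump, x) for f, x in zip(fitted, xs)]
    return {"intercept": intercept, "learning_rate": learning_rate, "stumps": stumps}


def predict(model, x):
    total = sum(predict_stump(s, x) for s in model["stumps"])
    return model["intercept"] + model["learning_rate"] * total


# Toy example (hypothetical data): a step function in one predictor.
xs = list(range(20))
ys = [0.0] * 10 + [1.0] * 10
model = fit_boosted_stumps(xs, ys, bag_fraction=1.0)
subsampled = fit_boosted_stumps(xs, ys, bag_fraction=0.5)
```

A smaller learning rate requires more trees to reach the same fit, which is the trade-off behind combining a slow learning rate with a large number of trees.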
In running the boosted regression tree analysis (gbm package in R; Greenwell et al. 2022), we used a bag fraction of 0.5 (i.e., at each iteration a 50% subsample of the data is drawn at random), and set a slow learning rate (0.001) to reduce overfitting (Elith et al. 2008). The number of trees in the final models varied between 2350 and 5600, well over the rule-of-thumb value of 1000 recommended by Elith et al. (2008). We report the cross-validation R² as our goodness-of-fit statistic. We report the importance of different variables and describe the shape of the relationship for the most important variables (full results in Online Resource 3). The loss function used in fitting the models varied, depending on the response type. We used Gaussian loss functions for all continuous responses, and a Poisson loss function for the count variables species richness and number of size classes. Tree species proportions were logit-transformed and modelled in the same way as continuous variables. Out of the originally included variables, the number of plots with large trees (DBH > 40 cm) was so low (see also Henttonen et al. 2020) that we dropped this variable from the statistical analyses, and simply display the location of these plots on the map.
Spatial autocorrelation is a common problem in ecological models, as many of the phenomena of interest have gradients, violating the assumptions underlying the statistical analyses. To account for this, we used the residuals autocovariate approach for boosted regression trees suggested by Crase et al. (2012). Here, we first fit the models with the predictor variables and extracted the model residuals. We then computed, for each plot, the mean of the residuals of its two nearest neighbors, and refit the models with this residual autocovariate as an additional predictor. We then computed Moran's I values to verify the absence of spatial autocorrelation in model residuals (Online Resource 4).
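The residual-based steps of this procedure can be sketched as follows. The sketch takes model residuals as given (the model fitting itself is omitted), computes the mean residual of the two nearest neighbours for each plot, and evaluates Moran's I. The use of binary nearest-neighbour weights for Moran's I is an assumption made for illustration, since the weighting scheme is not specified here, and the function names are hypothetical.

```python
import math


def k_nearest(coords, i, k=2):
    """Indices of the k plots closest to plot i (ties broken by index)."""
    ranked = sorted(
        (math.dist(coords[i], c), j) for j, c in enumerate(coords) if j != i
    )
    return [j for _, j in ranked[:k]]


def residual_autocovariate(coords, residuals, k=2):
    """Mean residual of the k nearest neighbours of each plot."""
    autocovariate = []
    for i in range(len(coords)):
        neighbours = k_nearest(coords, i, k)
        autocovariate.append(sum(residuals[j] for j in neighbours) / len(neighbours))
    return autocovariate


def morans_i(coords, values, k=2):
    """Moran's I with binary weights: 1 for the k nearest neighbours, 0 otherwise."""
    n = len(values)
    mean = sum(values) / n
    deviations = [v - mean for v in values]
    cross_products = 0.0
    weight_sum = 0
    for i in range(n):
        for j in k_nearest(coords, i, k):
            cross_products += deviations[i] * deviations[j]
            weight_sum += 1
    return (n / weight_sum) * (cross_products / sum(d * d for d in deviations))


# Toy example (hypothetical data): two spatial clusters with opposite residuals.
coords = [(0, 0), (1, 0), (2, 0), (10, 0), (11, 0), (12, 0)]
residuals = [1.0, 1.0, 1.0, -1.0, -1.0, -1.0]
autocovariate = residual_autocovariate(coords, residuals)
clustered_i = morans_i(coords, residuals)
```

In the toy data the residuals are perfectly clustered, so Moran's I is at its maximum. After refitting with the autocovariate as a predictor, values near zero would indicate that little spatial structure remains in the residuals.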

Tree size classes, large trees, and forest age
The number of tree size classes in the sample plots showed broad-scale spatial variation, with the northern and eastern parts of the country having a generally more diverse tree size structure within the plots (Fig. 2a). Cross-validation (CV) correlation in the boosted regression models was 0.35 (Fig. 3), meaning that the variables used were not able to explain much of the variance in the number of tree size classes. Human population density was the most important determinant (relative importance 43%), showing a U-shaped pattern (Fig. 4a) with the highest diversity of tree sizes in areas with the lowest populations. Other predictors showed only weak relationships with the number of tree size classes. Large trees (DBH size class 41 +) were present mainly in a small area in the East (Fig. 2a), and more consistently in northern Finland. The southern part of the country, and especially the western coastal regions, had very low numbers of large trees.
The age class of the oldest tree in the sample plots, which we interpret as the minimum age of the forest, showed a sharp latitudinal gradient (Fig. 2b). The sample plots in the southern part of the country were almost devoid of old trees, whereas old trees were abundant in the northern plots. In the South, there were few deviations from the large-scale average pattern, whereas in the North there was much more, and much stronger, fine-scale variation (see Online Resource 2). The age of the oldest trees was well explained by the model (cross-validation r = 0.79, Fig. 3), and was clearly linked to population density (relative importance 81%): the higher the population density, the younger the forest (Fig. 4b). The other variables were not related to the age of the oldest trees.

Species composition and diversity
Pine and spruce both variably dominated the forest throughout the country (Fig. 5a-b), with their dominance patterns largely mirroring one another. Their proportions of total basal area were equally well explained by the models (CV correlation 0.59 and 0.61 for pine and spruce, respectively), with site type as the most important predictor, especially for pine (relative importance 60%, compared to 39% for spruce; Fig. 3). The barren and xeric sites were pine-dominated, mesic and herb-rich sites were spruce-dominated, and the sub-xeric sites had shared dominance (Fig. 4c-d). Spruce proportion was also influenced by slash-and-burn agriculture (22%), so that the areas without slash-and-burn agriculture, or where it was discontinued earlier, had a higher proportion of spruce (Fig. 4e). The influence of other variables was minor.
The shares of birch and other deciduous species were consistently smaller than those of the two conifers (Fig. 5c-d). Birch dominance was highest in the North, close to the elevational tree line. Other deciduous species were rare, except in a region in the southeastern part of the country (Fig. 5d). The models explained the proportions modestly for birch, but somewhat better for the other deciduous species, with cross-validation correlations of 0.39 for birch and 0.54 for other deciduous species (Fig. 3). For birch, population density was the most important variable (relative importance 36%). Birch proportion decreased with increasing population density (Fig. 4f). For the other deciduous species, forest grazing (29%), slash-and-burn agriculture (20%), and site type (20%) were all important determinants, with positive relationships with the proportion of other deciduous species (Fig. 4g-i). These species were present mostly in the more fertile site types and abundant in the region with the occurrence of slash-and-burn agriculture.
Fig. 2 Smoothed spatial patterns of a number of size classes (classes included trees with DBH 4-10 cm, 10-20, 20-30, 30-40, 40 +, and plots without any trees), as well as the presence of trees in class 41 cm DBH or larger (shown as white circles), and b the oldest tree age class in the sample plots. High-resolution versions of the images showing deviations of individual observations from the smoothed mean are available as Online Resource 2
Tree species richness (number of species) and diversity (Shannon's H) both showed broad-scale variation. Geographical differences in tree species richness were stronger than in species diversity (Fig. 5e, f). Species richness was highest in the southeastern and central regions. Correlation with the cross-validation data was 0.46 (Fig. 3). As with the proportions of individual species, site type was the most influential variable (36%). For species diversity, which accounts for evenness in addition to the number of species, the northern part showed the lowest diversity (indicating dominance of a single species). The boosted regression tree models explained this variation slightly better than that in species richness (cross-validation correlation 0.50; Fig. 3). Site type (34%) had a positive relationship with tree species diversity. Population density was nearly as important (30%), showing a unimodal relationship with tree species diversity (Fig. 4l), where tree species diversity was low at low population density and highest at average population densities.
Fig. 3 Summary of the boosted regression tree analysis. Symbol colors show the overall cross-validation correlation (CV; a measure of model goodness-of-fit) between the values predicted by the boosted regression tree models and observed data not included in model fitting. The size of the symbol shows the relative importance of each predictor variable
Fig. 4 Partial dependence plots for important predictor variables (cross-validation correlation × relative importance ≥ 0.1) in the boosted regression tree analysis. The plots show the shape of the relationship between a predictor and the predicted outcome, when all other predictor variables in the model are held constant. Individual plots show the relationships between a oldest tree age class and population density, b number of DBH classes and population density, c proportion of pine and site type, d, e proportion of spruce and site type, and slash-and-burn agriculture, f proportion of birch and population density, g, h, i proportion of other deciduous species and site type, forest grazing, and slash-and-burn agriculture, j tree species richness and site type, and k, l tree species diversity and site type, and population density. Partial dependence plots for all variables are in Online Resource 3

Discussion
The question of forest "naturalness" in relation to the history of past human influence has been an important topic in many debates and studies (Brumelis et al. 2011). This question is important for defining forest reference conditions for ecosystem management and restoration (Berglund and Kuuluvainen 2021). A common problem in defining the stage or degree of forest naturalness in specific areas is the lack of comprehensive spatial and temporal data on the type, extent, and intensity of past human impact. However, early forest inventories, either by forestry companies (Boucher et al. 2014) or by national forest inventories, can be useful in providing the first snapshot data on forest conditions. Knowledge of forest naturalness and natural forest structure is based on information and syntheses from individual studies (Kuuluvainen and Aakala 2011). Our current understanding maintains that in natural conditions, Fennoscandian forests and forested landscapes are shaped by fire together with other "secondary" disturbances, especially in the long-term absence of fire (Kuuluvainen and Aakala 2011). Jointly with the impact of site type conditions, natural landscapes are characterized by the continuous presence of old and large trees, and are dominated by either Scots pine or Norway spruce, or by deciduous pioneers in stands in early successional stages (Pennanen 2002).
With increasing human population density and different forest-related livelihoods, forest structures were increasingly directly altered by human land-use. While the modern forest landscapes are dominated by a similar management regime and consequently relatively homogeneous forest structure (Mönkkönen et al. 2022), prior to modern forestry, different forest uses often targeted different types of trees, site types, and localities. As a consequence of the multitude of partly overlapping forest uses, forest structures and compositions were also heterogeneous over a range of spatial scales.
In our analysis based on the early 1920s NFI in Finland, the most striking pattern was in the ages of the oldest trees in the sample plots. These showed an almost step-like change along a line running from the northwestern end of the coastline toward the southeast. Sample plots in forests south of this "old forest edge" were virtually devoid of old trees, except for small pockets in poorly accessible locations. In contrast, old trees were present throughout the forested landscape in the North, similar to what we would expect from naturally developing forest landscapes (Pennanen 2002). This does not mean that old trees were completely absent from the South, but that they were potentially rare, occurring in small patches not well captured by a sampling scheme designed for statistically reliable regional estimates of forest resources.
The geographical patterns of forest age were also consistent with the analysis by Kalliola (1966), who used the presence of so-called kelo trees (decorticated, long-standing dead pine trees; Niemelä et al. 2002) as a proxy measure of forest naturalness in the later NFIs. Together with the age data presented here, this indicates that these characteristics of natural forests, and the associated long-term continuity of structures, were still present in the North, even though selective logging had by the early twentieth century modified stand structure, species composition, and age structure in all but the most remote regions. Similar logging was also practiced in northern Sweden (Östlund 1995).
For forest structures, the lack of large trees, especially in the South, is well documented (Henttonen et al. 2019, 2020) and was driven by centuries of timber use for shipbuilding and as sawn timber, especially from the mid-nineteenth century onwards (Tasanen 2004). However, our maps added detail on how the large trees were confined to the least accessible areas in the East and in the North, with only sporadic occurrences in the southern part of the country. In the North, their distribution was skewed somewhat toward the East, as the western parts were more accessible by rivers suitable for timber floating. In general, in regions further away from population centers, accessibility plays a crucial role as a driving factor for landscape change (Antrop 2004; Garbarino et al. 2013). These eastern and northern areas lacked terrestrial infrastructure, and the fact that the rivers used for timber floating in these areas discharge either to the Arctic Ocean or to the White Sea (far from the main export markets in Europe) also hampered the transport of timber. The sawmills of export ports were located mostly in the southern and western parts of the country, where shipbuilding activities were also centered (Tasanen 2004). Early mechanization increased the range of forest harvesting in the early twentieth century, but some of these areas nevertheless seem to have retained a higher share of the large trees of the natural forests.
Structural diversity, simplified here into the plot-level number of tree size classes (from 0 to 5), resembled the oldest tree age patterns, with the highest numbers of tree size classes generally in areas with old trees, and obviously in areas where large trees over 40 cm DBH were present. This means that the areas that had large trees also displayed a continuity of size classes. Although the fairly small plot size prevents a more detailed analysis of the diameter distributions at the plot level, the range of size classes in plots with old and/or large trees is consistent with natural forest structures. In pine-dominated forests, the variety of size classes typically occurs as cohorts of trees of different age, reflecting the history of frequent surface fires. In spruce-dominated old forests, uneven size structure is maintained through gap- and patch-scale dynamics (Kuuluvainen and Aakala 2011). In addition, as old trees are not necessarily large (Henttonen et al. 2019), and their dynamics may differ greatly, the age information adds detail to the overall picture of how human use had influenced the forests. In the North, old trees are ubiquitous, even in areas with few large trees, which is consistent with high-grading based on a diameter limit and otherwise limited human impact. In the South, both large and old trees are largely missing.
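The plot-level count of size classes described above can be sketched as follows. This is an illustrative Python sketch only, not the inventory's procedure. The class limits are hypothetical: the text gives only the 40 cm DBH threshold for large trees and the 0 to 5 range of the count. The function names and example plots are likewise invented for illustration.

```python
from bisect import bisect_left

# HYPOTHETICAL upper limits (cm) of the four smaller DBH classes.
# Only the 40 cm limit for "large trees" comes from the text.
CLASS_LIMITS_CM = [10, 20, 30, 40]


def size_class(dbh_cm):
    """Return a class index 0..4; index 4 means DBH over 40 cm."""
    # bisect_left keeps a tree of exactly 40 cm in class 3 ("over 40" is class 4).
    return bisect_left(CLASS_LIMITS_CM, dbh_cm)


def n_size_classes(dbh_list_cm):
    """Number of distinct size classes present on a plot (0 for an empty plot)."""
    return len({size_class(d) for d in dbh_list_cm})


# Hypothetical plots: a multi-cohort stand, a high-graded stand, a young stand.
multi_cohort = [8, 15, 24, 33, 46]
high_graded = [8, 12, 18, 25]
young = [6, 7, 9]

for name, plot in [("multi-cohort", multi_cohort),
                   ("high-graded", high_graded),
                   ("young", young)]:
    print(name, n_size_classes(plot))
```

With these assumed limits, a plot can only reach the maximum count of 5 if it contains at least one tree over 40 cm DBH, which is the link between large trees and continuity of size classes noted in the text.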
This strong contrast between the North and the South, shown on the large-scale interpolated maps, was in the stand-scale analysis closely related to the general human influence on forests, as captured by the population density in 1925 in the boosted regression tree modeling. This pattern is probably a reflection of extensive, long-term use of the forests: localities with the highest populations are typically also the locations that have been inhabited the longest. Indeed, a preliminary analysis (not shown) with old, spatially explicit census data from 1749 showed that local population densities correlate well between 1749 and 1925. We posit that this effect of population density on the age of the oldest trees in the forest encompasses the cumulative use of forests through time for most locations. High population density in an area suggests high pressure on forest resources, e.g., for timber for construction and other household needs, but above all for fuelwood (Tasanen 2004). In addition to household use of wood, there were regionally varying forest uses that targeted different trees (e.g., shipbuilding, tar extraction, sawn timber, ironworks) or different site types (slash-and-burn agriculture; Heikinheimo 1915), but it seems evident that at centennial time scales, and in the presence of permanent settlements, all of these uses led to a depletion of old trees in the forest landscapes. State-owned forests were for a very long time utilized freely, which explains why forest ownership did not explain much of the variation in the characteristics analyzed (Tasanen 2004).
In Fennoscandian natural forests, productive sites are typically deciduous-dominated in early successional stages, and spruce-dominated with an admixture of deciduous trees in late-successional stages. Xeric sites are naturally pine-dominated throughout stand development. On sub-xeric sites, where both conifers are able competitors, species composition is determined by fire history: frequent fires promote pine dominance, while long-term absence of fire leads to spruce dominance (Kuuluvainen et al. 2017). Tree species dominance in our data largely followed the pattern expected from site types under natural conditions. For the most part, forest landscapes were partitioned between the two dominant conifers, with mirrored large-scale spatial patterns and mirrored relationships with site type in the partial dependence plots. Thus, even if human use of the forests had had a major influence on the growing stock and the tree size structure of the forests (Ilvessalo 1927; Henttonen et al. 2019), the human fingerprint was less evident in the species proportions. This is evidently the result of the lack of artificial regeneration, tending of young stands, and intermediate harvests, which are nowadays used to guide species compositions to meet current expectations of future needs. While the boreal forest landscape consists of a mosaic of different site types, these are not evenly distributed, and hence the large-scale patterns in species dominance likely partly reflect the large-scale variation in site types. In the very North, spruce dominance is weak owing to its more southerly distribution, and pine and birch naturally dominate. The high proportion of birch at low population densities is due to birch dominance in the North, where it is the treeline-forming species. Hence, the relationship between population density and proportion of birch is most likely a temperature effect.
In addition to site types, land-use history influenced species composition in the stand-scale analyses. In particular, the high proportion of the group of other deciduous trees in the southeastern part of the country stands out in the interpolated maps. These trees typically occur only as admixtures in low proportions. In this region, their presence was the result of slash-and-burn agriculture, which through the past centuries was practiced very commonly over the region, influencing especially the initially spruce-dominated, more productive sites. The resulting decline of spruce-dominated forests is well documented in paleoecological research (Tolonen 1985; Pitkänen and Huttunen 1999). The abandoned cultivated areas were then naturally regenerated by deciduous trees (Heikinheimo 1915), which is still visible in the structures of forests that have avoided clearcutting (Čugunovs et al. 2017). Forest grazing, which was also a common practice in the region, subsequently affected species composition through selective herbivory and trampling by forest-grazing cattle. In particular, forest grazing led to an increase in poorly palatable Alnus at the expense of the other tree species. Outside of this region with a high proportion of other deciduous trees, the proportion of birch stands out. This probably reflects the development of species composition after slash-and-burn agriculture but under lower grazing pressure, which favors birch as a stronger competitor (Moretto and Distel 1997).
Tree species richness (i.e., the number of tree species present in a sample plot) showed little geographical pattern and was at the stand scale best (albeit overall weakly) explained by site type, with mesic sites having a higher tree species richness. Over larger scales, the most prominent feature was the low species richness in the North. Species diversity, which also accounts for how evenly the species are represented, showed more variation. The models linked diversity with site type as expected, and diversity was promoted by slash-and-burn agriculture. In the North, the low species diversity is linked with the old age of the forests: on mesic sites spruce dominates with an admixture of birch (Siren 1955) and rare occurrence of other deciduous trees, and on xeric sites pine is often the sole tree species. The highest tree species diversity was in central Finland, likely driven by a combination of a large proportion of productive site types and a history of slash-and-burn agriculture that diversified the tree species composition.
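The difference between species richness and Shannon's H discussed above can be illustrated with a small Python sketch. It assumes that H is computed from stem counts per species using the natural logarithm. The inventory may instead have used proportions of basal area or volume. The example plots and function names are hypothetical.

```python
import math
from collections import Counter


def species_richness(trees):
    """Number of distinct tree species recorded on a plot."""
    return len(set(trees))


def shannon_h(trees):
    """Shannon's H = -sum(p_i * ln p_i), with p_i the share of stems of species i.

    ASSUMPTION: shares are based on stem counts and the natural logarithm.
    """
    counts = Counter(trees)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log(n / total) for n in counts.values())


# Two hypothetical plots with the same richness but different evenness.
even_plot = ["pine"] * 5 + ["spruce"] * 5 + ["birch"] * 5
dominated_plot = ["pine"] * 13 + ["spruce"] + ["birch"]

for name, plot in [("even", even_plot), ("dominated", dominated_plot)]:
    print(name, species_richness(plot), round(shannon_h(plot), 3))
```

Both plots contain three species, but the plot dominated by a single species has the lower H. This is the sense in which low diversity in the North indicates dominance of a single species even where more than one species is present.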
There are some important limitations in the analysis here, related to the high small-scale variation in some of the variables that we analyzed, and to the approximate nature of the independent variables. These limit our findings to a somewhat qualitative level, as opposed to the regional statistical estimates in earlier studies (Henttonen et al. 2019, 2020). Nevertheless, we believe these findings help to understand the development of forest structures between the natural and the modern forests, the development of biodiversity (including extinction debts), and how humans shaped these forests in the past.
In recent mappings of forest landscape intactness, some historical human influence, such as selective logging, does not necessarily result in the loss of landscape intactness (Potapov et al. 2017). Using the presence of old trees as a criterion, it is clear that intact forest landscapes had been "lost" already earlier in the South, whereas in the northern and eastern parts of the country there appears to have been a long-term, large-scale continuity in the intactness of forest landscapes until the early twentieth century. Such continuity is nowadays seen as irreplaceable (Watson et al. 2018), especially since many characteristics of intact forests have very long recovery times (Svensson et al. 2022).

Conclusions
Our study shows how spatially explicit information from an early National Forest Inventory can open new views on the state of the forests in the past, which also helps in understanding the current state of the forests and forested landscapes. Considering the natural forest and landscape structure as a reference, characterized by (1) abundance of old trees, (2) abundance of large trees, and (3) dominance of pine and spruce with an admixture of deciduous trees, there was a striking contrast between the North and the South of Finland in the 1920s. Northern landscapes had retained many of their key natural characteristics until the early twentieth century, despite a history of selective logging and a millennia-long extensive human use. These characteristics include the presence of old trees across the landscapes, large trees in the eastern and northeastern parts of the country, and, in the absence of artificial regeneration, a site type-dependent species composition. In the South, old trees and large trees were practically absent in many places. Human land-use, especially slash-and-burn agriculture, promoted tree species diversity by increasing the amount of naturally regenerating, early-successional deciduous species, but this occurred at the expense of structural diversity. The longevity of live and dead tree structures, due to the slow ecological succession in the cold boreal forest, means that historical legacies may persist in forest structures over centennial to millennial time scales. We conclude that the strong human impact on boreal forests a century ago is likely to be currently reflected as historical legacies from past forest ecosystem structures, a situation which deserves attention both in forest ecological research and in forest management and restoration.