Background & Summary

The importance of primary forests is widely recognized1,2. First, they provide refuge to forest biodiversity3, and act as a buffer to species loss in human-dominated landscapes4. Second, primary forests play an important role in climate change mitigation. At the local scale, they buffer the adverse effects of increasing temperature on understory biodiversity, as they often have cooler forest-floor summer temperatures compared to secondary forests5. At the global scale they contribute to climate stability by storing large quantities of carbon, both in the biomass and in soils1,6,7. Third, primary forests often serve as a reference for developing close-to-nature forest management, or for benchmarking restoration efforts8. Finally, these forests are an irreplaceable part of our natural heritage, shape the cultural identities of local communities, and have a high intrinsic value9.

In Europe, as in many human-dominated regions, most forests are currently managed10, often with increasing harvest intensities11,12. As a result, despite the general trend of increasing total forest area, primary forests are scarce and continue to disappear13. For instance, Romania hosts some of the largest swaths of primary forest in Central Europe and has faced a sharp increase in logging rates since 2000. This has resulted in significant primary forest loss, even within protected areas13,14,15. In Poland, the iconic Białowieża Forest was recently in the spotlight after the controversial decision by the Polish National Forest Holding, now nullified by the Court of Justice of the European Union16, to implement salvage logging followed by tree planting after a bark beetle outbreak17. Widespread loss of primary forests has also occurred in Ukraine18, Slovakia19, and the boreal North, e.g., in the Russian North-West, where 4.6 Mha of primary forest have been lost since 200113,20. Effective protection of Europe’s primary forests is therefore urgently needed21.

In the newly released ‘Biodiversity Strategy for 2030’, the European Commission emphasized the need to define, map, monitor and strictly protect all of the EU’s remaining primary forests2. Reaching these objectives requires complete and up-to-date data on primary forests’ location and protection status. Such data could inform both policy making and conservation planning, as well as research, for instance by highlighting areas where primary forests are either scarce, or poorly studied. Yet, many data gaps remain on the location and conservation status of the EU’s primary forests21,22. Only a few countries have conducted systematic, on-the-ground inventories19,23. For most countries, data are either only available for a few well-studied forests24,25,26, or are limited to the distribution of potential (=unconfirmed) primary forests, typically predicted statistically or via remote sensing27,28,29. Despite past efforts for harmonizing data24,30, only recently has the first map of primary forests been released for Europe31,32 together with a first assessment of their conservation status21.

In a previous effort, we assembled a first European Primary Forest database (EPFD v1.0) that included 32 local-to-national datasets, plus data from a literature review and a survey, resulting in the mapping of a total of ~1.4 Mha of primary forest31. This was only about one fifth of the estimated 7.3 Mha of undisturbed forest still occurring in Europe, excluding Russia10. Also, most of the data collected in our v1.0 database were not open-access, and could thus not be used without the explicit consent of their respective copyright holders.

Here, we build on those efforts to progress further towards a complete map of Europe’s primary forests. First, we secured permission from all data holders to release all data as open access. Second, we aggregated and harmonized 16 additional regional-to-continental spatial datasets to now cover a total of 48 independent datasets. The EPFD v2.0 contains 18,411 non-overlapping primary forest patches (plus 299 point features) covering an area of 41.1 Mha (37.4 Mha in European Russia alone; Fig. 1) across 33 countries (Table 1)33. Key improvements of this new database include (1) filling major regional gaps, including European Russia, the Balkan Peninsula, the Pyrenees and the Baltic region, (2) mapping potential primary forests for Sweden and Norway (additional 16,311 polygons and 2.5 Mha - Fig. 2), two key regions where complete inventories are currently unavailable, and (3) an update of our literature review to January 2019.

Fig. 1

Overview of the primary forest patches contained in the EPFD v2.0. Both points and polygons were magnified to improve visibility.

Table 1 Summary of primary forest data across European countries.
Fig. 2

Overview of the maps of potential primary forests of Sweden and Norway.

Methods

Primary forest definition

Defining primary forests is controversial, and a range of different definitions have been put forward over the years22. In this paper, as in our previous work, we follow the FAO definition34, which describes primary forests as “naturally regenerated forest of native tree species, where there are no clearly visible indications of human activities and the ecological processes are not significantly disturbed”.

We operationalized this definition using the framework proposed by Buchwald35, where ‘primary forest’ is used as an umbrella term to include forests with different levels of naturalness, such as primeval, virgin, near‐virgin, old‐growth and long‐untouched forests. Based on this framework, a forest qualifies as primary if the signs of former human impacts, if any, have been strongly blurred by the decades (at least 60–80 years) elapsed since the end of forest management35. This time limit, however, depends on how modified the forest was at the starting point, and only applies in the case of traditional management, such as patch felling, partial coppicing, or selective logging. Stands regenerating naturally after a clear cut would therefore require a longer time period to be considered a primary forest (i.e., 60–80 years plus the length of a typical rotation cycle). Our definition of primary forests, therefore, does not imply that these forests were never cleared or disturbed by humans. We consider this to be in line with the Convention on Biological Diversity (CBD, https://www.cbd.int/forest/definitions.shtml), acknowledging that the concept of primary forests has a different connotation in Europe than in the rest of the world.

Finally, our collection of primary forests includes mainly old-growth, late-successional forests, but also some early seral stages and young forests that originated after natural disturbances and natural regeneration, without subsequent management. In the case of large primary forest tracts (>250 ha), our polygons can also locally include land not covered by trees.

Data collection

To create the EPFD v2.0, we first expanded and updated the literature review on primary forests we had originally carried out for EPFD v1.031, which only considered the period 2000–2017, and excluded European Russia. Specifically, we added all scientific studies published between January 2000 and January 2019 for Russia, and those published in 2017–2019 for the rest of Europe. We identified relevant publications in the ISI Web of Knowledge using the search terms “(primary OR virgin OR old‐growth OR primeval OR intact) AND forest*” in the title field. Based on our own interpretation of commonly used forest terms, we deliberately excluded terms such as “unmanaged” (meaning: not under active management), “ancient” (never cleared for agriculture) or “natural” (stocked with naturally regenerated native trees). These terms indicate conditions that are necessary, but not sufficient for considering a forest as primary. Finally, we refined our search using geographical and subject filters. The literature search returned 129 candidate papers. After screening their content, we added 23 additional primary forest stands (10 in European Russia, 13 in the rest of Europe), from 13 studies (four from European Russia, and nine from the rest of Europe).
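As an illustration of the title query described above, the following minimal sketch screens a list of titles against the Boolean expression “(primary OR virgin OR old‐growth OR primeval OR intact) AND forest*”. It is not the ISI Web of Knowledge search engine: the regular expressions, the function name `title_matches` and the example titles are assumptions made for illustration only.

```python
import re

# Naturalness terms from the quoted search string. The character class accepts
# an ASCII hyphen, a Unicode hyphen (U+2010) or a space in "old-growth".
NATURALNESS = re.compile(
    r"\b(primary|virgin|old[-\u2010 ]growth|primeval|intact)\b", re.IGNORECASE
)
# "forest*" is a wildcard: forest, forests, forestry, ...
FOREST = re.compile(r"\bforest\w*", re.IGNORECASE)


def title_matches(title: str) -> bool:
    """True if the title satisfies '(primary OR ... OR intact) AND forest*'."""
    return bool(NATURALNESS.search(title)) and bool(FOREST.search(title))


# Hypothetical titles, for illustration only.
titles = [
    "Structure of an old-growth beech forest",
    "Dynamics of unmanaged forests in Europe",
    "Primeval forests of the Carpathians",
    "Ancient woodland indicators",
]
selected = [t for t in titles if title_matches(t)]
```

Titles that use only the deliberately excluded terms (“unmanaged”, “ancient”, “natural”) are not retained by this filter, mirroring the rationale given in the text.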

Building the EPFD v1.031 involved reaching out to 134 forest experts. For v2.0 we contacted an additional 75 experts with knowledge of forests or forestry, and invited them to add spatially-explicit data on primary forests to our database. We focussed on experts from geographical regions poorly covered in v1.0. We received 56 answers, which led to the incorporation of 16 new datasets in our map. Given the context-dependency of definitions used in regional mapping projects, new datasets were only included if we could find an explicit equivalence between country-specific forest definitions and our definition framework35. This was done after discussing with data contributors the criteria and categories used for constructing their datasets, which we then mapped onto our definition framework. Depending on the datasets, these criteria included: (1) forest age or structural variables19,23,36, (2) legal designation25 or year since onset of protection37, (3) time since last anthropogenic disturbance38, or (4) the lack of human impacts and infrastructures39.

We integrated all data into a geodatabase, which contains primary forests either as polygons (if information on the forest boundary was available) or point locations (when having only an approximate centre location). We set 0.5 hectares as minimum mapping unit, although only a few of the datasets already contained in v1.0 contained polygons smaller than 2 ha (i.e., the minimum mapping unit originally used). If available, we included a set of basic descriptors for each patch: name, location, naturalness level (based on35), extent, dominant tree species, disturbance history and protection status. In total, our map harmonizes 48 regional-to-continental datasets of primary forests (Online-only Table 1). All data is open-access33, except for three datasets that we kept confidential, either for conservation or copyright reasons. These datasets are: ‘Hungarian Forest Reserve monitoring’ (ID 17, custodian: Ferenc Horváth); ‘Ancient and Primeval Beech Forests of the Carpathians and Other Regions of Europe’40,41 (ID 34, copyright: UNESCO), and ‘Potential OGF and primary forest in Austria’ (ID 48, custodian: Matthias Schickhofer). Additional non-open access polygons also exist for the dataset ‘Strict Forest Reserves in Switzerland’ (ID 30, custodian: Jonas Stillhard). These data are here referred to for transparency, but are neither included in the statistics and summaries reported here, nor in any of the remote-sensing analysis below.

Post-processing

To provide common descriptions for all features contained in the geodatabase, we integrated the basic descriptors detailed above with a range of attributes derived by intersecting all polygons or points of primary forests with layers of: 1) biogeographical regions, 2) protected areas, 3) forest type, and 4) forest cover.

We used the map of biogeographical regions42 to assign each primary forest point or polygon to one of the following ten classes: 1. Alpine, 2. Arctic, 3. Atlantic, 4. Black Sea, 5. Boreal, 6. Continental, 7. Macaronesia, 8. Mediterranean, 9. Pannonian, 10. Steppic. Similarly, we derived information on protection status and time since onset of protection for each primary forest polygon or point based on the World Database on Protected Areas (WDPA - https://www.protectedplanet.net). We simplified the original IUCN classification to three classes: 1. strictly protected (IUCN category I); 2. protected (IUCN categories II-VI + not classified); 3. not protected. This is a conservative aggregation recognizing the fact that, in certain contexts, logging and salvage logging are allowed inside national parks, at least in the buffer zone. In the case of polygons, we considered a primary forest patch as protected if > 75% of its surface was within a WDPA polygon. When better information on the protection status of a forest patch was available directly from data contributors, we gave priority to this source. We also assigned each primary forest polygon or point to one of the forest categories defined by the European Environment Agency43. The spatial information was derived by simplifying the map of Potential Vegetation types for Europe44, after creating an expert-based cross-link table21, which ties together forest categories and potential vegetation types reported in Table 4.1 from43. After excluding forest plantations, the remaining 13 categories comprised: 1. Boreal forest; 2. Hemiboreal forest and nemoral coniferous and mixed broadleaved-coniferous forest; 3. Alpine coniferous forest; 4. Acidophilous oakwood and oak-birch forest; 5. Mesophytic deciduous forest; 6. Lowland to submountainous beech forest; 7. Mountainous beech forest; 8. Thermophilous deciduous forest; 9. Broadleaved evergreen forest; 10. Coniferous forests of the Mediterranean, Anatolian and Macaronesian regions; 11. Mire and swamp forest; 12. Floodplain forest; 13. Non-riverine alder, birch or aspen forest. For each primary forest polygon (but not for points), we reported the two most common forest categories. Finally, we extracted for each primary forest polygon the actual share covered by forest. We did this because larger primary forest polygons in high naturalness classes can encompass land temporarily or permanently not covered by trees. For this, we used the tree cover density map for the year 2010 from45. All post-processing was performed in R (v3.6.1)46.
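The protection-status rule described above (three simplified classes and the 75% overlap criterion) can be sketched as follows. This is a minimal illustration in Python, not the R code used for the database. The function name, the input format (a list of category and overlap-area pairs, assumed not to overlap each other) and the way strict and non-strict protected areas are combined are assumptions that the text does not specify.

```python
# IUCN labels treated as "category I" (Ia and Ib are its two subcategories).
STRICT_CATEGORIES = {"I", "Ia", "Ib"}


def patch_protection(patch_area_ha, overlaps, threshold=0.75):
    """Classify a primary forest polygon into one of three protection classes.

    overlaps: list of (iucn_category, overlap_area_ha) tuples, one per
    protected area intersecting the patch (hypothetical input format).
    Any category outside STRICT_CATEGORIES, including 'not classified',
    counts as 'protected'.
    """
    strict_share = sum(a for c, a in overlaps if c in STRICT_CATEGORIES) / patch_area_ha
    total_share = sum(a for _, a in overlaps) / patch_area_ha
    # Assumption: the > 75% criterion is applied first to strictly protected
    # areas alone, then to all protected areas together.
    if strict_share > threshold:
        return "strictly protected"
    if total_share > threshold:
        return "protected"
    return "not protected"
```

For example, under these assumptions a 100 ha patch with 80 ha inside a category Ia reserve is classed as strictly protected, whereas a patch with exactly 75 ha inside a category IV area is classed as not protected, because the criterion is a strict inequality.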

Potential primary forests of Sweden and Norway

For Sweden and Norway, where abundant geographic information was available on forest distribution, we created maps of potential (but so far unconfirmed) primary forests. For Sweden, we derived a workflow to create a map of potential primary forests as detailed in Fig. 3. This yielded 14,300 polygons covering a total area of 2.4 Mha.

Fig. 3

Workflow and data sources for the map of potential primary forests in Sweden. Data on woodland key habitats derive from60 (see also: https://www.skogsstyrelsen.se); forest with conservation value from61,62, forest core areas from63, continuity forests from64,65, protected mountain coniferous forests from66, clear cuts and fellings from https://www.skogsstyrelsen.se.

For Norway, even though we were able to include two datasets of confirmed primary forests, additional primary forest is expected to exist. Therefore, we derived a map of potential primary forests, based on the “Viktige Naturtyper” dataset from the Norwegian Environment Agency47, which maps different habitat types of high conservation value both inside and outside forested areas. We extracted all polygons larger than 10 ha classified as “old forest types” (=“gammelskog”), i.e., forests that have never been clearcut and are in age classes of 120 years or older. This yielded 2,011 polygons covering a total area of 0.1 Mha.

Importantly, these layers were neither directly integrated in our composite map, nor used to calculate country level statistics as they only represent a first approximation of the primary forest situation in these countries, so far without ground validation. Yet, we included these layers in our geodatabase with the goal of directing future ground-based mapping efforts.

Data Records

The EPFD v2.033 is composed of 48 individual datasets (Online-only Table 1) and the two layers of potential primary forests for Sweden and Norway. We integrated the 48 datasets into two composite feature classes, after excluding all duplicated/overlapping polygons across individual datasets.

1) EU_PrimaryForests_Polygons_OA_v20

  • Composite feature class combining the forest patches classified as “primary forest” based on polygon data sources described in Online-only Table 1

  • Data type: Polygon Feature Class

2) EU_PrimaryForests_Points_OA_v20

  • Composite feature class combining forest locations classified as “primary forest”, based on point data sources described in Online-only Table 1. Only points not overlapping with polygons in (1) reported.

  • Data type: Point Feature Class

The individual datasets are also included in the geodatabase, inside the feature dataset ‘European_PrimaryForests’. The whole database is stored in Figshare (https://doi.org/10.6084/m9.figshare.13194095.v1)33. The file format is ESRI personal geodatabase (.mdb). Each feature class in the geodatabase follows the structure described in Online-only Table 2. A full description of each individual dataset is reported in the metadata file ‘DATASET_overview_v2.0_20201030.docx’, available at the same link.

Technical Validation

We benchmarked our data against country-level statistics on primary forest extent. Although we had no direct control of the raw data contained in our database, the fact that all our information on primary forest locations derives either from peer-reviewed scientific literature, or was field-checked by trained researchers and/or professionals suggests high data reliability. We made sure to have a common understanding with data contributors about forest definitions (i.e., refs. 34,35), and only included a dataset in the EPFD if we could find an explicit equivalence with the forest definitions we used. Additional information on the harmonization process is reported for individual datasets in the metadata accompanying our geodatabase.

An additional, wall-to-wall validation of our database using remotely-sensed information is currently impossible. Remote sensing data only cover the last 35 years, and even if high resolution laser ranging (LIDAR) might become available in the future, at the moment no reliable workflow exists for mapping primary forests from such multi-sensor data. The alternative is field work, which is clearly unfeasible given the huge area covered by our database, the large number of polygons, and the cost and time that a statistically valid ground sample would require. Still, remote sensing data can be helpful for checking whether a patch of primary forest underwent human disturbance after it was delineated, which is why we implemented a semi-automatic procedure based on Landsat satellite-image time series (1985–2018) (see below).

Benchmarking against country-level statistics

Our database contains most of the geographical information currently available on primary forests in Europe, but we do not claim this data is complete. To benchmark the completeness of our map, we calculated the ratio between the area of primary forest in our database at country level, and the estimated area of “forest undisturbed by man” from the indicator 4.3 in the Forest Europe report10 or, for those countries where this information is not available, from FAO’s Forest Resources Assessment48. Although the definition of “forest undisturbed by man” in Forest Europe is consistent with our definition of primary forest10, it must be noted that these country-level estimates stem from national inventories or other studies, and data quality varies from country to country49. The comparison presented here should, therefore, be taken with caution (Fig. 4).
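The completeness ratio described above can be written as a short function. This is a minimal sketch: the function and argument names are hypothetical, and returning `None` for countries where the reference reports no primary forest is an assumption made to mirror the ‘No Reported PF’ class of Fig. 4.

```python
def completeness_ratio(mapped_ha, forest_europe_ha=None, fao_ha=None):
    """Ratio between mapped primary forest area and the reference estimate.

    The Forest Europe estimate ('forest undisturbed by man', indicator 4.3)
    is used when available; otherwise the FAO estimate is used.
    Returns None when the reference is missing or reports zero area.
    """
    reference = forest_europe_ha if forest_europe_ha is not None else fao_ha
    if not reference:  # covers both None and 0
        return None
    return mapped_ha / reference


# Europe-wide figures quoted earlier for EPFD v1.0: ~1.4 Mha mapped against
# an estimated 7.3 Mha of undisturbed forest (excluding Russia).
ratio_v1 = completeness_ratio(1.4e6, forest_europe_ha=7.3e6)
```

With these inputs the ratio is about 0.19, i.e., the “about one fifth” reported for v1.0. Ratios above 1 are possible and indicate that the map contains more primary forest than the country-level statistics report.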

Fig. 4

Estimation of data completeness. Ratio between the total primary forest area in the EPFD v2.0 and the country estimate of ‘forest undisturbed by man’ (indicator 4.3) from Forest Europe10 or, if unavailable, the country estimates of primary forests based on FAO’s Forest Resources Assessment48. Gray polygons represent countries where Forest Europe (or FAO) reports no forest undisturbed by man (‘No Reported PF’).

Forest Europe reports no primary forest for some western European countries (Spain, France, Belgium, Netherlands, Germany, United Kingdom and Ireland), although for most of these countries we did find information on at least a handful of primary forest sites. The coverage of our map was also higher than expected for some Eastern European countries (e.g., Ukraine, Belarus, Lithuania), as well as Norway and Finland, known for hosting large areas of primary forests. Data completeness was lower for some central European countries. In the case of Czechia, Slovakia, Poland and Romania, our data only accounted for 20–100% of the country-level estimates from Forest Europe10. For Austria, Switzerland and Hungary, instead, additional data on primary forests exist but are not currently open-access, and are therefore not considered here. The largest data gaps were in Sweden, Italy, Bulgaria, Estonia, Denmark and Russia, where our map accounted for less than 10% of the primary forest reported in Forest Europe10. The low data completeness found for Denmark likely depends on the inclusion of minimum-intervention forest reserves in Forest Europe (see49) that were harvested until recently and therefore do not qualify as primary forests according to our definition.

Assessing recent human disturbance with remote sensing

Since our data were collected continuously over the last two decades, we cannot exclude that some forest patches may have undergone human disturbance after data collection. This is particularly relevant for areas where primary forests are lost at high rates, such as the Carpathians, Russian Karelia, or Northern Fennoscandia18,19,20. To assess to what extent this might be an issue, we used the open-access Landsat archive and the LandTrendr disturbance detection algorithm50,51, using Google Earth Engine52 (Fig. 5). Specifically, we 1) quantified the proportion of polygons in our map that underwent disturbance between 1985 and 2018, i.e., Landsat 5 operating time, 2) visually checked a stratified random selection of these disturbed polygons to quantify the prevalence of anthropogenic vs. natural disturbance, and 3) estimated the proportion of polygons in our map not meeting the necessary, but not sufficient, condition for being classified as primary (i.e. not being affected by anthropogenic disturbance within the last 35 years).

Fig. 5

Workflow of the assessment of recent human disturbance in primary forest polygons.

For each polygon contained in the map of primary forests, we extracted the whole stack of available Landsat images (~1985-today), and ran the LandTrendr53 algorithm. LandTrendr identifies breakpoints in spectral time series, separates periods of disturbance or stability, and records the years in which disturbances occurred. To avoid problems due to cloud cover, changes in illumination, and atmospheric condition, we used all available images from the growing season of each year (1 May through 15 September) to derive yearly composite images54. As our spectral index, we used Tasseled Cap Wetness (TCW), as this index is particularly sensitive to forest structure55, is robust to spatial and temporal variations in canopy moisture56, and consistently outperforms other spectral indices, including Normalized Difference Vegetation Index53, for detecting forest disturbance50,57,58,59. As input parameters for the LandTrendr algorithm when detecting forest disturbances, we used a prevalue of −300 TCW units, a minimum disturbance magnitude of 500 TCW units, and a maximum duration of 4 years.
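To make the three input parameters above concrete, the sketch below applies them as a filter to trajectory segments that are assumed to have already been fitted to a yearly TCW series. It is not the LandTrendr algorithm and does not use its API. The `Segment` structure, the definition of magnitude as a TCW decrease, and the direction of each comparison (including reading the prevalue as a minimum pre-disturbance TCW) are assumptions made for illustration.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    """One fitted segment of a yearly TCW trajectory (hypothetical structure)."""
    start_year: int
    end_year: int
    start_tcw: float  # TCW at the segment start (pre-disturbance value)
    end_tcw: float    # TCW at the segment end


def is_disturbance(seg, prevalue=-300.0, min_magnitude=500.0, max_duration=4):
    """Flag a segment as a forest disturbance using the thresholds in the text.

    Assumptions: a disturbance appears as a TCW decrease, the pre-disturbance
    value must be at least `prevalue`, and duration is counted in years.
    """
    magnitude = seg.start_tcw - seg.end_tcw
    duration = seg.end_year - seg.start_year
    return (
        seg.start_tcw >= prevalue
        and magnitude >= min_magnitude
        and 1 <= duration <= max_duration
    )


# Hypothetical segments, for illustration only.
abrupt_loss = Segment(2001, 2002, -100.0, -900.0)   # large drop within 1 year
slow_decline = Segment(2001, 2008, -100.0, -900.0)  # same drop over 7 years
```

Under these assumptions the first segment is flagged as a disturbance, while the second is rejected because it exceeds the maximum duration of 4 years.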

After running LandTrendr, we eliminated noise by applying a minimum disturbance threshold (2 ha). We then visually inspected a stratified random selection of primary forest polygons highlighted as ‘disturbed’ by LandTrendr using very-high-resolution images available in Google Earth. For each biogeographic region, we randomly selected 20% of disturbed polygons up to a maximum of 100 polygons per region. Depending on the size of the polygons, we inspected up to 5 randomly selected disturbed pixels within each disturbed polygon with a minimum distance between pixels of 1 km. Based on the spectral and physical characteristics of the disturbed patch (brightness, shape, size), and on ancillary information derived from the Google Earth imagery, we assigned disturbance agents as either anthropogenic (i.e., forest harvest, infrastructure development) or natural (e.g., windstorm, bark beetle outbreak, fire; Figs. 6, 7). We conservatively considered a polygon as anthropogenically disturbed if at least a third of the points we checked for that polygon were anthropogenically disturbed. To avoid introducing an observer bias, all polygons were checked by the same photo-interpreter (FMS).
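The sampling design and the decision rule described above can be sketched as follows. This is an illustration only: the function names, the rounding up of the 20% share, the fixed random seed and the label strings are assumptions, not details taken from the original workflow.

```python
import math
import random


def sample_size(n_disturbed, percent=20, cap=100):
    """Polygons to inspect in one region: 20% of disturbed polygons, at most 100.

    Rounding up is an assumption; the text does not state the rounding rule.
    """
    return min(cap, math.ceil(n_disturbed * percent / 100))


def stratified_sample(disturbed_by_region, seed=0):
    """Draw a random sample of disturbed polygon IDs within each region."""
    rng = random.Random(seed)
    return {
        region: rng.sample(ids, sample_size(len(ids)))
        for region, ids in disturbed_by_region.items()
    }


def is_anthropogenic(pixel_labels):
    """A polygon counts as anthropogenically disturbed if at least one third
    of its checked pixels were labelled 'anthropogenic'."""
    n_anthropogenic = sum(label == "anthropogenic" for label in pixel_labels)
    return 3 * n_anthropogenic >= len(pixel_labels)


# Hypothetical polygon IDs, for illustration only.
example = stratified_sample({
    "Boreal": list(range(1000)),
    "Pannonian": list(range(12)),
})
```

With these hypothetical inputs, the Boreal stratum is capped at 100 polygons and the Pannonian stratum contributes 3. The one-third rule is evaluated with integer arithmetic to avoid floating-point edge cases when exactly one in three pixels is anthropogenic.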

Fig. 6

Examples of disturbed polygons, as detected by LandTrendr, before (left) and after (right) disturbance. (a) Natural disturbance in Babia Gora, Slovakia; (b) natural disturbance in the southern Bourgas Province of Bulgaria; (c) clear-cuts in Tatra National Park in Slovakia; (d) clearcuts in the Russian Republic of Karelia. Red circles are centred on the disturbed pixel randomly selected for visual inspection, and have a radius of 50 m; pink squares have a side of 1 km and were exclusively used to provide context reference to the photointerpreter. Image credits: Google Earth.

Fig. 7

Geographical distribution of naturally vs. anthropogenically disturbed polygons, as resulting from a visual check of 712 pixels across 268 polygons.

Out of the 17,309 polygons checked with LandTrendr, 4,734 (27.3% of total) experienced major disturbances between 1985 and 2018. The proportion of disturbed area was greater than 10% in 2,904 polygons. We visually inspected a total of 712 pixels across 268 primary forest polygons, corresponding to 1.5% of the total number of polygons and 5.7% of the disturbed polygons. We attributed a total of 149 pixels, across 61 primary forest polygons, to anthropogenic disturbance, i.e., 22.7% (bootstrapped standard error = 2.5%) of the polygons we checked (Table 2, Fig. 7). We thus estimated the total number of anthropogenically disturbed primary forest polygons by multiplying the total number of polygons by the proportion of disturbed polygons (27.3%) and by the share of these disturbed polygons attributed to anthropogenic causes (22.7%). This suggests our map contains 1,077 anthropogenically disturbed polygons (95% CIs [847, 1323]), which corresponds to 6.2% (95% CIs [4.9%, 7.6%]) of the total number of polygons. Disturbed polygons were concentrated in the Russian Federation (especially in Archangelsk region, Karelia and Komi republics), Southern Finland, and the Carpathians (Fig. 7; Table 2). The Boreal and Alpine biogeographical regions had the highest number of disturbed polygons (both in total, and when considering only those with evident anthropogenic disturbance). The regions with the highest share of anthropogenically disturbed polygons were the Continental and Boreal regions. The sample size in Macaronesia was too low to provide a reliable estimation of the incidence of human disturbance.
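The arithmetic behind these estimates can be reproduced approximately with the counts given above. The sketch below uses a simple, unstratified bootstrap of the inspected polygons. The resampling scheme, the number of replicates and the seed are assumptions, so the result only approximates the reported standard error and is not the authors' procedure.

```python
import random
import statistics

# Counts taken from the text.
N_TOTAL = 17309        # polygons checked with LandTrendr
N_DISTURBED = 4734     # polygons flagged as disturbed
N_CHECKED = 268        # disturbed polygons inspected visually
N_ANTHROPOGENIC = 61   # inspected polygons attributed to human disturbance


def bootstrap_proportions(successes, n, n_boot=2000, seed=42):
    """Resample n binary outcomes with replacement and return the proportions."""
    rng = random.Random(seed)
    outcomes = [1] * successes + [0] * (n - successes)
    return [sum(rng.choices(outcomes, k=n)) / n for _ in range(n_boot)]


proportions = bootstrap_proportions(N_ANTHROPOGENIC, N_CHECKED)
standard_error = statistics.stdev(proportions)

# Point estimate: disturbed polygons times the anthropogenic share.
estimated_polygons = N_DISTURBED * N_ANTHROPOGENIC / N_CHECKED
estimated_share = estimated_polygons / N_TOTAL
```

Under these assumptions the standard error comes out close to the reported 2.5%, the point estimate is roughly 1,077 polygons, and the corresponding share is about 6.2% of all polygons.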

Table 2 Recent human disturbance in primary forest polygons, summarized by biogeographical region.

These estimates should be considered as lower bounds, because only the disturbance events with a magnitude sufficient to be captured with LandTrendr and occurring in 1985–2018 could be identified. As this is not a formal validation, the results presented here should not be extrapolated to primary forests not included in our map. Finally, because our database was built with a bottom-up approach, we are unable to exclude the existence of remaining bias or interpretation error, which might have propagated through the successive steps required to build it. As such, we warn users against possible heterogeneity in data quality, accuracy and completeness across datasets.

Usage Notes

All data files are referenced in a geographic coordinate system (lat/long, WGS 84 - EPSG code: 4326). The provided files are in a personal geodatabase, and can be accessed and displayed using standard GIS software such as QGIS (www.qgis.org/en).

All datasets listed in Online-only Table 1 are freely available in Figshare (https://doi.org/10.6084/m9.figshare.13194095.v1)33 with a Creative Commons CC BY 4.0 license. Two additional non open-access datasets are available on request to the corresponding author after approval of the respective copyright holders. These datasets are: ‘Hungarian Forest Reserve monitoring’ (ID 17, custodian: Ferenc Horváth); and ‘Potential OGF and primary forest in Austria’ (ID 48, custodian: Matthias Schickhofer). The same conditions apply for additional data from the dataset ‘Strict Forest Reserves in Switzerland’ (ID 30, custodian: Jonas Stillhard). In the case of the dataset ‘Ancient and Primeval Beech Forests of the Carpathians and Other Regions of Europe’40,41 (ID 34, Custodian: UNESCO), this data is freely available online, but its copyright does not allow redistribution. We refer the interested reader to the website https://www.protectedplanet.net/903141 for the original data.

Comments and requests for updates to the dataset are collected and discussed in the GitHub forum: https://github.com/fmsabatini/PrimaryForestEurope.