Planning the review
The design of this systematic map was established in detail in an a priori protocol. It follows the guidelines for systematic reviews and evidence synthesis issued by the Collaboration for Environmental Evidence.
As described in the protocol, we established the scope and focus of the map in close cooperation with stakeholders, primarily in Sweden. Before submission, peer review, revision and final publication of the protocol, a draft version was open for public review at the website of the Mistra Council for Evidence-Based Environmental Management (EviEM) in May 2014. Comments were received from scientists, environmental managers and other stakeholders, and the protocol was revised accordingly.
Searches for literature
When searching for relevant literature, we used online publication databases, search engines, specialist websites and literature reviews. Whenever possible, we applied the search terms specified below. In many cases, however, the search string had to be simplified as some sites could handle only a limited number of search terms or did not allow the use of ‘wildcards’ or Boolean operators.
No time, language or document type restrictions were applied.
Initially, we conducted a scoping exercise to assess alternative search terms, testing them against a set of about 20 articles known to be relevant. This resulted in a preliminary search string that was used for the main part of the literature searches. Based on suggestions by stakeholders and on the terminology in relevant papers found with the preliminary search terms, a few terms were later added to the search string and used in a set of supplementary searches. The final selection of search terms was as follows:
Subject: forest*, woodland*, “wood* pasture*”, “wood* meadow*”
Forest type: boreal, boreonemoral, hemiboreal, nemoral, temperate, conifer*, deciduous, broadlea*, “mixed forest”, spruce, “Scots pine”, birch, aspen, beech, “Quercus robur”, Swed*
Intervention: conserv*, restor*, rehabilitat*, “active management”, (prescribed OR control* OR experiment*) AND (burn* OR fire*), thinn*, (partial OR selecti* OR gap OR retention) AND (felling OR cutting OR harvest*), “green-tree retention”, *introduc*, remov*, graz*, brows*, girdl*, ditch*, flood*, fenc*, exclos*, pollard*, coppic*
Outcomes: *diversity, species AND (richness OR focal OR target OR keystone OR umbrella OR red-list* OR threatened OR endangered OR rare), “species density”, “number of species”, indicator*, abundance, “dead wood”, “woody debris”, “woody material”, “forest structure”, habitat*
The terms within each of the categories above (‘subject’, ‘forest type’, ‘intervention’ and ‘outcomes’) were combined using the Boolean operator ‘OR’. The four categories were then combined using the Boolean operator ‘AND’. An asterisk (*) is a ‘wildcard’ that represents any group of characters, including no character.
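The combination rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `build_search_string` and `wildcard_matches` are hypothetical, the term lists are abbreviated to two terms per category, and the wildcard check uses Python's `fnmatchcase`, whose matching rules may differ from those of the individual publication databases.

```python
from fnmatch import fnmatchcase


def build_search_string(categories):
    """Join terms with OR inside each category, then join the categories with AND."""
    groups = ["(" + " OR ".join(terms) + ")" for terms in categories.values()]
    return " AND ".join(groups)


def wildcard_matches(term, word):
    """Return True if `word` matches `term`, where '*' stands for any group
    of characters, including no character (illustrative approximation only)."""
    return fnmatchcase(word, term)


# Abbreviated term lists: only two terms per category, for illustration.
categories = {
    "subject": ["forest*", "woodland*"],
    "forest type": ["boreal", "temperate"],
    "intervention": [
        "thinn*",
        "((prescribed OR control* OR experiment*) AND (burn* OR fire*))",
    ],
    "outcomes": ["*diversity", '"dead wood"'],
}

query = build_search_string(categories)
print(query)

# 'forest*' matches 'forest' itself as well as longer words.
print(wildcard_matches("forest*", "forest"), wildcard_matches("forest*", "forestry"))
```

Sites that could not handle long strings or Boolean operators would need a reduced version of `categories`, which is the kind of simplification reported in Additional file 2.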
The ‘forest type’ category of search terms was included in order to keep the number of articles at a manageable level—without these terms, the amount of literature to be screened would have increased about fourfold. The ‘forest type’ search terms were chosen to optimise the likelihood of finding relevant studies in Sweden or in forests elsewhere that are dominated by tree species commonly occurring in Sweden. However, the terms were also judged to be capable of identifying a satisfactory share of relevant studies carried out in other boreal and temperate forest types throughout the world.
At some of the websites mentioned below, searches were also made for relevant literature in Finnish, French, German, Russian and Swedish, using search terms in these languages. A translation of the full English search string was used when French literature was searched for in publication databases. In other cases, the selection of search terms had to be reduced and customised to individual websites, since few of these accept long and complex search strings and some of the English terms could not be translated to other languages.
About 10 months after the main searches for literature in English, an update was made using Web of Science and Google Scholar.
Full details of the search strings used for each search are recorded in Additional file 2, together with search dates and the number of articles found.
The search utilised the following online publication databases:
GeoBase + GeoRef
JOSKU (University of Eastern Finland library)
eLIBRARY.ru (Научная электронная библиотека)
Science Citation Index
Web of Science
Wiley Online Library
The main searches for literature in English were made with the preliminary search string in ten of these databases. Supplementary searches for English literature using the additional search terms in the final search string were made in Academic Search, Scopus and Web of Science (except for one addition, “brows*”, which was searched for in Web of Science only). Literature in Finnish, French, Russian and Swedish was searched for in subsets of the publication databases listed above (see Additional file 2 for details).
Internet searches were performed using the following search engines:
Google Scholar (http://scholar.google.com)
In most cases, the first 200 hits (sorted by relevance) were examined for appropriateness.
Websites of the specialist organisations listed below were searched for links or references to relevant publications and data, including grey literature.
Other literature searches
As a check of the comprehensiveness of our searches, relevant articles and reports were also searched for in literature reviews. Moreover, each member of the review team used national and international contacts to get information on current research related to the topic of the review, and also to find non-peer-reviewed literature, including reports and theses published in e.g. Swedish, Finnish, Estonian or Russian.
Screening of literature
Articles found by searches in publication databases were evaluated for inclusion at three successive levels. First, they were assessed by title by a single reviewer (primarily CB or JS). In cases of uncertainty, the reviewer chose inclusion rather than exclusion. As a check of consistency, a subset of 100 titles was assessed by both of the primary reviewers and also by four other members of the review team (BGJ, KJ, AL and JM). Of the 76 titles in this subset that had been excluded by one of the primary reviewers (or both), 69 were also excluded by at least two of the additional reviewers. Four of the remaining seven titles were excluded by only one of the additional reviewers, and three titles by none of them. After discussions and agreements on whether to include or exclude certain borderline topics that had been identified by this exercise, the title screening was allowed to continue.
Next, each article found to be potentially relevant on the basis of title was assessed for inclusion on the basis of abstract, again by a single reviewer (CB, JS or BGJ) who in cases of uncertainty tended towards inclusion. At an early stage, a subset consisting of 100 abstracts was assessed by all three reviewers involved in this part of the screening process, and the consistency of their assessments was checked with kappa tests. The outcomes ranged between κ = 0.50 (CB vs. JS) and κ = 0.78 (CB vs. BGJ), indicating ‘moderate’ to ‘substantial’ agreement. Discussion of the discrepancies between the primary reviewers (CB and JS) resulted in additional specifications of how the inclusion criteria were to be interpreted. When a second subset of 100 abstracts was screened by the two primary reviewers, the kappa statistic relating to their assessments was found to be 0.63, indicating ‘substantial’ agreement.
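For readers unfamiliar with the kappa statistic, the following Python sketch shows how Cohen's kappa can be computed for two reviewers' include/exclude decisions. The decisions in the example are invented for illustration and are not the review team's data. The verbal labels follow the widely used Landis and Koch scale, which is consistent with the labels quoted above; that the authors applied exactly this scale is an assumption.

```python
def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two equally long lists of categorical decisions."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two non-empty lists of equal length")
    n = len(ratings_a)
    # Observed proportion of items on which the two reviewers agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance, from each reviewer's marginal proportions.
    labels = set(ratings_a) | set(ratings_b)
    expected = sum(
        (ratings_a.count(label) / n) * (ratings_b.count(label) / n)
        for label in labels
    )
    if expected == 1:
        return 1.0  # both reviewers used one identical label throughout
    return (observed - expected) / (1 - expected)


def agreement_label(kappa):
    """Verbal label on the Landis and Koch scale (assumed, see lead-in)."""
    if kappa < 0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")]:
        if kappa <= upper:
            return label
    return "almost perfect"


# Hypothetical decisions for 100 abstracts (not the review's actual data):
# 20 included by both, 60 excluded by both, 20 disagreements.
reviewer_a = ["include"] * 20 + ["exclude"] * 60 + ["include"] * 10 + ["exclude"] * 10
reviewer_b = ["include"] * 20 + ["exclude"] * 60 + ["exclude"] * 10 + ["include"] * 10

kappa = cohen_kappa(reviewer_a, reviewer_b)
print(round(kappa, 2), agreement_label(kappa))
```

In this invented example the reviewers agree on 80 % of the abstracts, yet kappa is only about 0.52, because a large share of that agreement would be expected by chance when most abstracts are excluded.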
Finally, each article categorised as potentially relevant on the basis of abstract was assessed for inclusion by one reviewer who studied the full text. This task was shared by all members of the review team. The articles were randomly distributed within the team, but some redistribution was made to avoid having reviewers assess studies authored by themselves or articles written in an unfamiliar language. Articles found using search engines, specialist websites, literature reviews or stakeholder contacts were entered at this stage in the screening process.
Almost 90 % of the full-text assessments were double-checked by a second reviewer (primarily CB). Where the first and second reviewers disagreed on whether to include a study or not, they discussed and reconciled their assessments on a case-by-case basis. Certain categories of studies identified as doubtful during this stage of the screening were discussed by the entire team. Based on these discussions, some of the inclusion criteria were specified further.
Study inclusion criteria
Each study had to pass each of the following criteria in order to be included:
Any habitat with a tree layer was regarded as forest, which meant that studies of e.g. wooded meadows and urban woodlands could be included.
As an approximation of the boreal and temperate vegetation zones we used the cold Köppen-Geiger climate zones (the D zones) and some of the temperate ones (Cfb, Cfc and Csb), as defined by Peel et al. (see Fig. 1). The other temperate Köppen-Geiger climate zones are often referred to as subtropical and were therefore considered to fall outside the scope of this systematic map.
Nevertheless, forest stands dominated by ponderosa pine (Pinus ponderosa) were considered as relevant even if located outside the climate zones mentioned above. These forests constitute a well-studied North American habitat type that shares several characteristics with the pine forests in boreal and temperate regions.
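The geographical rule described above, including the ponderosa pine exception, can be expressed as a simple check. The sketch below is a hypothetical illustration: the function name `within_geographical_scope` and its parameters are not part of the review's tooling, and it assumes that each study site has already been assigned a Köppen-Geiger code.

```python
# Temperate Köppen-Geiger zones treated as in scope; all D (cold) zones also qualify.
TEMPERATE_ZONES_IN_SCOPE = {"Cfb", "Cfc", "Csb"}


def within_geographical_scope(koppen_code, ponderosa_dominated=False):
    """Return True if a site falls within the vegetation zones covered by the map.

    koppen_code: Köppen-Geiger code of the site, e.g. 'Dfc' or 'Cfa'.
    ponderosa_dominated: True for stands dominated by Pinus ponderosa,
        which were considered relevant regardless of climate zone.
    """
    if ponderosa_dominated:
        return True
    code = koppen_code.strip()
    return code.startswith("D") or code in TEMPERATE_ZONES_IN_SCOPE


print(within_geographical_scope("Dfc"))   # cold zone, in scope
print(within_geographical_scope("Cfa"))   # temperate zone outside the selected subset
print(within_geographical_scope("Csa", ponderosa_dominated=True))  # exception applies
```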
Removal of woody understorey or ground-layer vegetation
Removal or addition of litter or humus
Creation of dead wood
Addition (translocation) of dead wood
Exclusion or other deliberate manipulation of wild cervids and similar grazers/browsers
Livestock grazing and traditional mowing, coppicing and pollarding
Underplanting of trees and (re)introduction of native non-tree species
Control of exotic and/or invasive species
Liming and use of herbicides, if the primary goal was conservation
Clearcutting was not included, since we did not find this intervention useful for the conservation of forest biodiversity (although we admit that clearing of an established stand may be relevant in very specific cases, e.g. when the aim is to substitute a plantation with an alternative forest type). We did, however, include coppicing, because this is in many regions a traditional forest management system with specific biodiversity values worth maintaining. Pollarding (a traditional harvesting technique that affects large trees across entire stands) was included for similar reasons, but not other kinds of pruning that are applied in gardening and for managing single trees, often for aesthetic reasons (see Fig. 2).
Studies of partial harvesting were not included if less than 25 % of the volume or basal area of living and dead trees was retained, or if the intervention consisted of gap felling with an average gap size exceeding 0.5 ha. Existing meta-analyses have concluded that harvested stands start to function as clearcuts (from a biodiversity point of view) when the retention level drops to somewhere between 15 and 40 % [46, 47]. Nevertheless, studies of 25–50 % retention levels may provide some conservation insights, e.g. into the possibilities of combining management for forest biodiversity with management for wooded-grassland diversity and/or for species favoured by disturbances. The threshold we chose for gap sizes was based on the FAO definition of forest as land with a certain minimum tree cover and an area of more than 0.5 ha; hence we considered gaps larger than 0.5 ha as clearcuts.
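The two thresholds for partial harvesting can be summarised as a small decision function. This is a minimal sketch with hypothetical names (`partial_harvest_eligible`, `retention_pct`, `mean_gap_ha`). It covers only the retention and gap-size thresholds stated above, not the other inclusion criteria, and it assumes that a value of `None` means the study did not involve that aspect.

```python
MIN_RETENTION_PCT = 25.0   # minimum share of volume or basal area retained
MAX_MEAN_GAP_HA = 0.5      # larger average gaps were regarded as clearcuts


def partial_harvest_eligible(retention_pct=None, mean_gap_ha=None):
    """Apply the retention and gap-size thresholds to a partial-harvesting study.

    retention_pct: percentage of the volume or basal area of living and dead
        trees retained, or None if not applicable.
    mean_gap_ha: average gap size in hectares for gap felling, or None if
        not applicable.
    """
    if retention_pct is not None and retention_pct < MIN_RETENTION_PCT:
        return False
    if mean_gap_ha is not None and mean_gap_ha > MAX_MEAN_GAP_HA:
        return False
    return True


print(partial_harvest_eligible(retention_pct=30))   # True: at least 25 % retained
print(partial_harvest_eligible(retention_pct=20))   # False: below the 25 % threshold
print(partial_harvest_eligible(mean_gap_ha=0.8))    # False: gaps exceed 0.5 ha
```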
When in doubt about the relevance of interventions intended to benefit particular species (notably tree species), we generally included or excluded studies based on whether study authors described the interventions as being made for the purpose of conservation or not.
Several of the stakeholders that we consulted when developing the protocol suggested that studies of wildfires should be included, but we decided not to do so. Wildfire is usually not a management option, although it may be possible to choose whether to suppress a fire or not. Moreover, while there is an extensive literature on the effects of unplanned and uncontrolled fires (e.g. [49, 50]), their consequences for biodiversity cannot be assumed to be identical to those of prescribed burning. We judged that including only studies of prescribed burning was appropriate for the purposes of this review.
Both temporal and spatial comparisons of how different kinds of forest management affected biodiversity were considered to be relevant. This means that we included both ‘BA’ (Before/After) studies, i.e. comparisons of the same site prior to and following an intervention, and ‘CI’ (Control/Impact) studies, i.e. comparisons of treated and untreated sites (or sites that had been subject to different kinds of treatment). Studies combining these types of comparison, i.e. those with a ‘BACI’ (Before/After/Control/Impact) design, were also included.
Most CI and BACI studies that are relevant to the subject of this systematic map compare different forest stands or different parts of a single stand. However, studies of how creation or addition of dead wood affects biodiversity may be based on comparisons of individual trees (logs or snags) that have been subject to different treatments (e.g. girdling vs. other ways of killing trees), and we included such comparisons as well.
Moreover, we found a number of seemingly useful dead-wood studies that did not compare effects of different kinds of intervention but were based on other types of comparison instead, and we therefore decided to extend the comparator criterion by also including studies of the three following categories:
Studies comparing biodiversity effects of dead-wood creation/addition in different kinds of forest stands (e.g. stands of different age or stands subject to different kinds of management).
Studies comparing biodiversity effects of creation/addition of different kinds of dead wood (e.g. wood of different species or sizes).
Studies comparing biodiversity aspects of created/added vs. naturally occurring dead wood.
The following types of outcome were considered to be relevant:
Abundance of single species or taxonomic or functional groups of terrestrial organisms (including the soil seed bank)
Species richness, diversity index and composition of taxonomic or functional groups of terrestrial organisms (including the soil seed bank)
Performance and population viability of target species
Abundance and diversity of dead wood
Stand structure (horizontal and/or vertical distribution of trees)
Occurrence of tree microhabitats (e.g. cavities)
Based on this criterion, we excluded e.g. simulation studies, review papers and policy discussions.
Language: Full text written in English, French, German, Danish, Norwegian, Swedish, Finnish, Estonian or Russian.
During the screening process, we sometimes found it necessary to specify the inclusion criteria further by deciding whether to include or exclude certain borderline topics or study categories, based on their relevance to conservation or restoration. The final set of criteria, including all specifications, is listed in Additional file 3.
Study quality assessment
No quality appraisal was made of studies subsequent to their inclusion in the review, since this is not considered necessary for the purposes of a systematic map. Nevertheless, the screening for relevance described above did involve certain quality aspects. Since we required studies to present ‘useful’ data on interventions, we excluded investigations of the effects of silvicultural systems (such as ‘uneven-aged’ or ‘near-natural’ forestry) if they provided insufficient information on how the forest had been managed, e.g. no data on the specific interventions on which these kinds of forestry were based. Similarly, since comparators were also required to be ‘useful’, we excluded studies where the ages or species compositions of treated and untreated stands were entirely different (e.g. studies of interventions in young plantations where mature or old-growth stands were used as controls).
If studies included in the map are later selected for full systematic review, they will have to undergo full critical appraisal. The data on study design that are provided in the map may be relevant when such an assessment is made. For instance, studies with a CI or BACI design are likely to be more useful than BA studies in the context of forest management. This is because a forest set-aside that has been subject to some kind of active management may also be affected by other influences (e.g. changes in weather, climate or atmospheric pollution, or ecological succession following earlier land-use changes). Such influences can be controlled for in CI and BACI studies, but not in BA studies. On the other hand, it should be noted that CI studies can be misleading if confounding differences between treated and untreated sites (due e.g. to interventions in the past) are not known and described well enough. Other relevant quality aspects of the study design include the size of treatment/intervention units and the degree of replication. Such aspects may well be taken into account in a full systematic review, but they have not been used as criteria for exclusion in the present systematic map.
Systematic map database
The database that constitutes the core of this systematic map provides basic information on each study found to be relevant. This information is available in an Excel file (Additional file 4), and also in an interactive GIS application. The GIS application plots study locations on a zoomable world map, and data on the studies can be retrieved by clicking on the symbols in the map. The application also provides a table with the same content as the Excel file. Both the GIS application and the Excel file allow data to be filtered and sorted.
Each included study is described and categorised based on the following types of data (to the extent that they were available):
location of study area (country, state/province, region or site(s), geographical coordinates, altitude),
research programme to which the study belonged,
forest type (coniferous/mixed/deciduous),
dominant tree species,
type of comparison (BA/CI/BACI),
number of true replicates,
intervention type(s) categorised using codes listed in Additional file 4,
intervention(s) specified using free text,
outcome type(s) categorised using codes listed in Additional file 4,
focal species, communities and/or biodiversity indicators.
In addition, the database contains links that search Google Scholar for the title of each included article. They will return links to abstracts and full-text versions of the articles if these are available through Google Scholar.
Descriptions recorded in the database were normally extracted from the included articles, but if no geographical coordinates were given, we recorded approximate coordinates based on published site names, maps or verbal descriptions of study locations (or coordinates provided in another article describing the same site). Moreover, it was not uncommon for coordinates given by study authors to be clearly incorrect (e.g. confusing minutes of arc with decimals of degrees, or confusing latitude and longitude with coordinates based on national grid systems). In such cases, too, we recorded coordinates based on other information.
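The first kind of coordinate error mentioned above arises when a position given in degrees and minutes of arc is read as a decimal number. The following sketch, with a hypothetical helper `dm_to_decimal`, shows the correct conversion. The coordinates are invented for illustration.

```python
def dm_to_decimal(degrees, minutes, hemisphere="N"):
    """Convert degrees and minutes of arc to decimal degrees.

    Southern and western hemispheres are returned as negative values.
    """
    if not 0 <= minutes < 60:
        raise ValueError("minutes of arc must be in the range 0 to 60 (exclusive)")
    value = degrees + minutes / 60
    return -value if hemisphere in ("S", "W") else value


# 59 degrees 30 minutes north is 59.5 decimal degrees, not 59.30.
latitude = dm_to_decimal(59, 30, "N")
longitude = dm_to_decimal(17, 45, "E")
print(latitude, longitude)
```

Reading 59°30′ as 59.30° would displace the site by 0.2° of latitude, roughly 22 km.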
In cases where some of the data reported by a study fell outside the scope of our review (e.g. where some of the study sites were located outside relevant vegetation zones), we recorded information only on those parts of the study that fulfilled our inclusion criteria.
The number of true replicates recorded in the database was strictly based on the extent to which the intervention was replicated, regardless of the scale of the intervention (and even if study authors stated that they had avoided pseudoreplication by spacing sampling sites widely enough). For instance, studies of exclusion from grazing were considered to be non-replicated if they were based on one exclosure only, even if the exclosure was large and contained many sampling sites. If treated sites and controls were not replicated to the same extent, we always recorded the lowest number of replicates.
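The counting rule for true replicates can be illustrated as follows. This is a sketch under assumptions: the function name `true_replicates` and the data layout (one dictionary per study arm, mapping each independent intervention unit to its number of sampling sites) are hypothetical, and a result of 1 is taken here to represent a non-replicated study.

```python
def true_replicates(units_per_arm):
    """Count true replicates as the smallest number of independent units in any arm.

    units_per_arm maps each arm (e.g. 'treated', 'control') to a dictionary of
    intervention units and their number of sampling sites. The number of
    sampling sites within a unit does not affect the result.
    """
    return min(len(units) for units in units_per_arm.values())


# One large exclosure with many sampling sites, compared with three grazed areas.
exclosure_study = {
    "exclosure": {"E1": 12},
    "grazed control": {"C1": 4, "C2": 4, "C3": 4},
}

# Four thinned stands compared with three unthinned stands.
thinning_study = {
    "thinned": {"T1": 5, "T2": 5, "T3": 5, "T4": 5},
    "unthinned": {"U1": 5, "U2": 5, "U3": 5},
}

print(true_replicates(exclosure_study))  # 1: a single exclosure, however many sites
print(true_replicates(thinning_study))   # 3: the arm with fewer units decides
```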
The first round of data recording was shared by all members of the team. Two of us (CB and JS) then added some supplementary data, mainly on locations of study areas. Finally, one reviewer (CB) double-checked all entries in the map database for consistency.