Background

The European Commission proposed in its recent Biodiversity Strategy to maintain and enhance European ecosystems and their services by 2020 by establishing green infrastructure (GI) and restoring at least 15% of degraded ecosystems [1]. The package of actions designed to respond to this challenge included the need to ensure no net loss of biodiversity and ecosystem services by EU-funded projects, priority setting regarding restoration, and promoting the use of GI [1]. GI is defined as the network of natural and semi-natural areas, features and green spaces in rural and urban, terrestrial, freshwater, coastal and marine areas [2]. This includes for instance areas of high nature value such as protected areas, floodplains, wetlands and natural forests, natural landscape features that can act as corridors for wildlife, artificial features such as eco-ducts or eco-bridges, and multifunctional zones where land uses are favoured that help maintain or restore healthy biodiverse ecosystems [3, 4]. The European Commission emphasizes the ability of GI to perform multiple functions in the same spatial area, thus sustaining a range of benefits by delivering multiple ecosystem services (ESS) such as air and water purification and climate regulation [5, 6]. ESS represent the benefits human populations derive, directly or indirectly, from ecosystem functions [7], and both functions and benefits might be affected through interventions, such as reconnection of natural areas and improvement of overall ecological quality of the countryside. A combination of the delivery of multiple ESS including the conservation of biodiversity could lead to win-win situations and thus present an efficient way of achieving long-term nature conservation [8]. Knowledge generation to promote understanding of such situations is a current research priority in conservation biology, applied ecology, and environmental sciences [9, 10]. 
Within this systematic review we will assess multifunctionality in terms of ESS that are affected by an implemented intervention.

Floodplains develop adjacent to river channels and can be described as low-relief Earth surfaces composed of fluvial deposits [11, 12] that are frequently flooded (active floodplains) or formerly flooded (morphological floodplains) and are an integral part of catchments [13]. While hosting important natural assets and high levels of biodiversity [14–16], they have been used since ancient times by human populations, who attempted to maximize the benefits they gained by interventions such as irrigation channels and dikes [17]. In many parts of the world, human activities have altered the landscape and disrupted fluvial processes to the extent that floodplains are among the world's most threatened ecosystems [18–20]. Floodplains are good examples of multifunctional landscapes and GI. Their management requires close coordination among agriculture, water use, hydrological engineering, mineral extraction, energy production, nature conservation and spatial planning [21] and poses multi-dimensional challenges to policy-makers and project managers [22]. Flood protection is particularly important in light of an increasing frequency and amplitude of flood events throughout Europe, resulting in casualties and damage [23, 24]. Restoration of a river and its adjacent floodplain might generate many benefits for nature and society, including alternative economic activities, improved flood prevention, richer biodiversity, aesthetically appealing landscapes and particular recreational opportunities. However, information on implementation and outcomes of such projects is often inaccessible [25].

Evidence for biodiversity effects of the GI approach and particularly of multifunctional floodplain management is scattered and has not been synthesised [21]. This issue is of particular relevance for large lowland floodplains, where due to high human population densities a variety of ecosystem services are in demand while at the same time floodplain biodiversity is driven by dynamic biophysical processes and feedback mechanisms over broad spatial and temporal scales [13, 17]. As climate is an important factor for ecological processes, floodplains situated in climates comparable to those occurring in Europe are of particular relevance for this review, which aims to support European decision-making. Floodplain interventions are very diverse [26] and in this scientific review we will hierarchically categorize the encountered interventions with respect to their main aims and effects. The interventions also differ strongly regarding the frequency of their implementation and the degree to which their impact on biodiversity has been assessed or results published in accessible formats [25]. This must be considered when interpreting the results of this review. The level of multifunctionality of interventions can be assessed in terms of their effects on ESS. For instance, several restoration measures aiming at a dynamic habitat mosaic are supposed to additionally increase the provision of ESS, such as water purification and lifecycle maintenance, habitat and gene pool protection [13]. Suitable indicators of biodiversity include measures such as the diversity or abundance of species, taxonomic or functional groups [27–30]. The effects of the floodplain management measures on biodiversity will depend on several factors, the most obvious being the taxa considered and the time since intervention.
Floodplain management measures can have very different effects on different taxa. For instance, a water enhancement scheme for the Danube floodplain within the city limits of Vienna showed positive effects on dragonflies and molluscs, while no significant impact was observed for fish [31]. Time since intervention is a crucial parameter: depending on several factors, such as the availability of propagules for population establishment, an intervention might show its effects only after a considerable time span [32].

Objective of the review

In this systematic review we aim to synthesise evidence in response to a two-part primary question dealing with the effects of multifunctional floodplain management on biodiversity. We will further assess three secondary questions dealing with the main causes of heterogeneity in patterns detected.

Primary question

What is the impact of floodplain management measures on biodiversity and how does the impact vary according to the level of multifunctionality of the measures?

The question contains the following components:

Population: floodplains and rivers, including all ecosystems that are located in the morphological floodplain and linked to the hydrological regime of the river.

Intervention: floodplain management measures, commonly related to production and transport (e.g. water or mineral extraction, navigational infrastructure), water regulation and flood protection, conservation and restoration as well as recreation activities (see Methods section for further examples).

Comparator: the previous state of the floodplain before the implementation of the intervention, the original natural state of the floodplain, or the state of the floodplain after another kind of intervention.

Outcome: change in biodiversity indicators (diversity and abundance indicators of species or other groups of organisms).

Secondary questions

  a) How does the biodiversity impact of floodplain management differ across taxa?

  b) What is the effect of the time since implementation on the impact of floodplain management measures?

  c) Which other factors significantly modify the biodiversity impact of floodplain management measures?

Methods

Searches

Database search terms and languages

Three categories of search terms will be applied, corresponding to the categories of the questions, i.e. population, intervention and outcome (Tables 1, 2 and 3). The comparator will not be included in the search itself but applied as an inclusion criterion. We aim to perform the search in the two main databases for scientific literature, i.e. Scopus and Thomson Reuters Web of Knowledge (formerly ISI Web of Knowledge). The main search terms for each category will be complemented by alternative terms deemed by the review team to have similar significance, given that the terms have been applied in several key papers [26, 33–35]. Among the three categories, the terms will be linked with the Boolean operator ‘AND’. Within the three categories, the terms will be linked with the Boolean operator ‘OR’. In the “outcome-group”, the main search term “biodiversity” will be complemented by a combination of (i) any of the four terms “diversity”, “richness”, “abundance”, and “density” AND (ii) any of many alternative terms for “species”, such as “genus”, “taxon”, “plant”, “tree”, “bird”, “insect”, “macrozoobenthos”, etc. (Table 3). To be considered, studies will have to contain at least one term for each of the three categories in title, keywords and abstract (Scopus) or in topic (Thomson Reuters Web of Knowledge).

Table 1 Search terms for the population “floodplains”
Table 2 Search terms for the intervention “floodplain management”
Table 3 Search terms for the outcome “biodiversity” following the formula “biodivers* OR (group1 AND group2)”

Thus the total search string will have the following structure:

(Population-Term-1 OR Population-Term-2 OR … OR Population-Term-n) AND

(Intervention-Term-1 OR Intervention-Term-2 OR … OR Intervention-Term-n) AND

(biodiversity OR ((diversity OR richness OR abundance OR density) AND (Outcome-Term-1 OR Outcome-Term-2 OR … OR Outcome-Term-n)))
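The structure above can be assembled programmatically. The following is a minimal sketch, not the review team's actual tooling: the function names are hypothetical, the example terms are placeholders rather than the full lists of Tables 1 to 3, and database-specific field codes are deliberately left out.

```python
def or_group(terms):
    """Join alternative terms with OR and wrap them in parentheses."""
    return "(" + " OR ".join(terms) + ")"


def build_search_string(population, intervention, outcome_taxa,
                        measures=("diversity", "richness", "abundance", "density"),
                        main_outcome="biodiversity"):
    """Combine the three categories with AND, following the structure above.

    The outcome category is `main_outcome OR (measures AND outcome_taxa)`.
    """
    outcome = f"({main_outcome} OR ({or_group(measures)} AND {or_group(outcome_taxa)}))"
    return " AND ".join([or_group(population), or_group(intervention), outcome])


# Placeholder terms for illustration only (assumption, not the protocol's lists).
query = build_search_string(
    population=["floodplain*", "river*"],
    intervention=["restor*", "dike*"],
    outcome_taxa=["species", "bird*"],
)
print(query)
```

Keeping the term lists as plain Python lists makes it straightforward to add or drop alternative terms and regenerate the query for each database.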

While the search terms have been developed and will be applied in the English language only, non-English documents returned by these English search terms will be included in the systematic review. No time and document type restrictions will be applied.

Grey literature

We will cover a representative share of European grey literature by a complementary expert assessment. Selected experts from a broad range of European countries will synthesize personal expertise and grey literature for their specific country following a template to specify, among other things, the role of multifunctionality in floodplain management and evidence for effects of multifunctional floodplain management approaches on biodiversity. Other ways of dealing with grey literature, such as searches in Google Scholar and retrieving a limited number of hits (e.g. 50) as proposed by CEBC [36], seem to be less adequate for our purposes. These seemingly systematic procedures would produce a highly arbitrary selection because (i) the topic is very broad (e.g. all floodplain management interventions, all taxa), (ii) they require a simple search string, (iii) much relevant grey literature on the topic is written in non-English languages, and (iv) much information was never adequately published, partly because commissioned studies were kept confidential or because they are part of larger and on-going floodplain management activities. The complementary expert assessment is almost completed at the time of compiling this protocol and will be published before the systematic review is written up. Consistency and differences of the findings of the two processes will be discussed in the discussion section of the systematic review.

Literature provided directly by stakeholders

Stakeholders were asked beforehand to provide literature. This literature was used to establish this systematic review protocol, but will also be considered for the definitive review. We will report how many of the papers provided by the stakeholders overlap with those of the systematic search and how many of them were deemed suitable for assessment when applying the inclusion criteria.

Comprehensiveness and effectiveness of the database search

We tested the comprehensiveness of the search string in the following way: (i) we agreed on a short list of 6 expressions to be included regarding the population; (ii) we established extensive lists of 86 and 72 alternative terms for intervention and outcome, respectively; (iii) we evaluated the overall hits of the full query in the Scopus database; (iv) we evaluated the specific additional hits provided by each of the intervention-terms while keeping outcome constant; (v) we ranked the intervention terms according to their number of specific additional hits and assessed their cumulative hits by adding them one by one according to their relevance; and (vi) we repeated the last two steps for the outcome-terms while keeping the intervention-terms constant.

We found that 37 of 86 terms for intervention (43%) did not yield any specific additional hits, as was the case for 45 of 72 terms for outcome (62.5%). Due to the long and flat plateaus of the saturation curves (Figure 1), we assume that our search string will adequately cover the relevant literature. Thus, we will not search bibliographies of selected papers for potential additional literature except for identifiable review articles falling under the scope of this study, which will be searched for relevant primary studies. These primary studies detected in review articles will be treated in the same way as those identified directly by the search strings. The high proportion of alternative terms yielding zero or few additional hits might potentially be caused by having chosen the wrong terms, but we consider this highly improbable, because much literature was screened and many experts on the topic have been involved in the compilation of the lists.

Figure 1 Saturation curves of the test searches for (a) intervention and (b) outcome. Full lines specify the cumulative hits when adding search terms one by one, dotted lines represent the specific additional hits for each alternative term. Terms were ranked according to their number of additional hits. These evaluations were performed with the Scopus database, keeping constant the search terms for the other aspect, respectively, and for “population”.

To ensure the review is as comprehensive as possible we opted to keep the alternative terms that did not result in any additional hits in our search string, as keeping them will not require any further effort, but they might yield hits when using other combinations of terms, being translated to other languages, searching other databases or when searching in the future.
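The saturation test described in this section can be illustrated with a small sketch. It rests on an assumption about how "specific additional hits" were counted: here, a term's specific additional hits are the records returned by that term and by no other term in the same category. The record identifiers and terms are toy data, and the function names are hypothetical.

```python
def specific_additional_hits(term, hits_by_term):
    """Count records found by `term` but by none of the other terms."""
    others = set().union(
        *(ids for other, ids in hits_by_term.items() if other != term)
    )
    return len(hits_by_term[term] - others)


def saturation_curve(hits_by_term):
    """Rank terms by specific additional hits, then accumulate hits one by one."""
    ranked = sorted(
        hits_by_term,
        key=lambda term: specific_additional_hits(term, hits_by_term),
        reverse=True,
    )
    seen, curve = set(), []
    for term in ranked:
        seen |= hits_by_term[term]          # union of all records found so far
        curve.append((term, len(seen)))     # cumulative number of unique hits
    return curve


# Toy data: term -> set of record identifiers returned by the database.
hits = {
    "restor*": {1, 2, 3, 4},
    "dike*": {3, 5},
    "levee*": {3, 4},
}
curve = saturation_curve(hits)
print(curve)
```

In this toy example the last term adds no new records, which corresponds to the flat plateau of a saturation curve.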

Study inclusion criteria and study collection

Articles identified by the search strategy will be filtered during a process consisting of three steps. First the inclusion criteria listed below will be applied to the titles of the studies. Titles often provide enough information (e.g. regarding the population or the geographical location) to clearly recognize incongruous articles which can subsequently be removed. The remaining articles will be filtered by viewing the abstract followed by the full text. Incongruity might occur and be detected in any of the three stages, because a study may obviously not match with the population (e.g. because it concerns a different kind of water body or does not match geographically or climatically), the intervention or the comparator (e.g. no intervention takes place or no comparator is used, while instead the study might describe the ecological status of a floodplain and recommend management measures), or will focus on different outcomes (e.g. geomorphology, water dynamics). If there is insufficient information to exclude a study, it will be kept in the database until the next stage.

To assess and limit the effects of between-reviewer differences in determining relevance, two reviewers will apply the inclusion criteria to a set of randomly chosen articles at the start of the abstract filtering stage. The kappa statistic [37], which measures the level of agreement between reviewers, will be calculated. If kappa is less than 0.6, the reviewers will discuss the discrepancies and clarify the interpretation of the inclusion criteria. This may entail a modification in the criteria specification. After this discussion, the reviewers will apply the inclusion criteria to the remaining articles. Studies reported in articles must meet the following criteria to be included in the review and used for data extraction.
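As an illustration of the agreement check, the sketch below computes kappa for two reviewers. It assumes the statistic is Cohen's kappa for two raters and nominal decisions, which is the usual choice in this setting but is not spelled out in the protocol. The decisions are invented example data.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters assigning nominal categories to the same items."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("Both raters must rate the same non-empty set of items.")
    n = len(ratings_a)
    # Observed proportion of items on which the reviewers agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance, from each reviewer's marginal frequencies.
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / n ** 2
    if expected == 1:
        return 1.0  # both reviewers used a single identical category throughout
    return (observed - expected) / (1 - expected)


# Invented decisions for ten abstracts (illustration only).
reviewer_a = ["include", "include", "exclude", "exclude", "include",
              "exclude", "exclude", "exclude", "include", "exclude"]
reviewer_b = ["include", "exclude", "exclude", "exclude", "include",
              "exclude", "include", "exclude", "include", "exclude"]

kappa = cohens_kappa(reviewer_a, reviewer_b)
needs_discussion = kappa < 0.6  # threshold stated in the protocol
print(round(kappa, 3), needs_discussion)
```

In this example the reviewers agree on 8 of 10 abstracts, yet kappa falls just below 0.6 because chance agreement is high, so the discrepancies would be discussed.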

Relevant population

Floodplains including all ecosystems that are located in the morphological floodplain and linked to the hydrological regime of the river (e.g. rivers, oxbows, floodplain forests, flood-meadows, paddy fields) will be considered. Our focus is on large lowland floodplains, and we therefore exclude headwater streams (Strahler’s river order ≤ 3) and their floodplains for the purpose of this study. All other kinds of wetlands, such as lakes, estuaries, deltas and tidal flats, peatlands and fishponds [38], will not be considered.

We focus on environmental conditions that prevail in Europe, because this systematic review aims to support European decision-making. Evidence might come from other continents, but the environmental conditions should be similar to those in Europe. For this purpose, this systematic review will be limited geographically to the areas in both northern and southern hemispheres lying between the tropic and the polar circle, i.e. between 23° 26′ 22″ and 66° 33′ 39″, and climatically to the following Köppen-Geiger climate classes [39]: (i) “Dfc – Snow/fully humid/cool summer”, (ii) “Dfb – Snow/fully humid/warm summer”, (iii) “Dfa – Snow/fully humid/hot summer”, (iv) “Cfb – warm temperate/fully humid/warm summer”, (v) “Cfa – warm temperate/fully humid/hot summer”, (vi) “Csb – warm temperate/summer dry/warm summer”, and (vii) “Csa – warm temperate/summer dry/hot summer”.
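The geographic and climatic limits can be expressed as a simple screening check. This is a sketch under two assumptions: the latitude limits are treated as inclusive, and the Köppen-Geiger class of a study site is supplied by the reviewer (for example from the classification map cited above) rather than derived in code. The function name and example sites are hypothetical.

```python
# Latitude limits from the protocol, converted from degrees/minutes/seconds.
TROPIC_DEG = 23 + 26 / 60 + 22 / 3600
POLAR_CIRCLE_DEG = 66 + 33 / 60 + 39 / 3600

# Köppen-Geiger classes listed in the protocol.
ELIGIBLE_CLIMATES = {"Dfc", "Dfb", "Dfa", "Cfb", "Cfa", "Csb", "Csa"}


def is_eligible_site(latitude_deg, koppen_class):
    """Return True if a site lies between tropic and polar circle (either
    hemisphere) and falls into one of the eligible climate classes."""
    in_latitude_band = TROPIC_DEG <= abs(latitude_deg) <= POLAR_CIRCLE_DEG
    return in_latitude_band and koppen_class in ELIGIBLE_CLIMATES


# Hypothetical sites: (latitude in decimal degrees, climate class).
print(is_eligible_site(48.2, "Cfb"))    # mid-latitude, eligible class
print(is_eligible_site(-35.0, "Cfa"))   # southern hemisphere, eligible class
print(is_eligible_site(10.0, "Cfa"))    # tropical latitude
print(is_eligible_site(48.2, "BSk"))    # class not on the list
```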

Types of intervention

All types of intervention related to floodplain management will be considered. Such interventions are commonly related to production and transport, hydrological engineering and flood protection, conservation and restoration or recreation. Specific examples include water extraction, navigational infrastructure, construction of dikes, construction of detention basins, removal of bank fixation, lowering of entrenchment depth, wood placement, installation of flow deflectors, elongation of river length, creating a new water course or multiple channels, extensification of land use and the re-connection of backwaters [26].

Types of comparator

We will only include studies that use comparators. We have identified three types, in which the outcome of interventions related to floodplain management is compared to (i) the previous state before the implementation of the intervention (e.g. [26]), (ii) the original natural state of the floodplain (mainly when assessing the performance of restoration measures, e.g. [40]) or (iii) the state of a comparable floodplain after the implementation of another kind of intervention (e.g. [41]). Additional heterogeneity in the application of comparators in the primary studies will be caused by the different kinds of study designs (see section “Study quality assessment”).

Types of outcomes

To be included, a study must assess the impact on biodiversity. As biodiversity (which implies the entire genetic, species and habitat diversity of an area) cannot be assessed directly, studies will use indicators of biodiversity. In this review, we will consider studies that assess impact on biodiversity expressed by indicators related to diversity or abundance of groups of organisms, such as species, other taxa (e.g. genera, families, subspecies), guilds (e.g. forest birds, rheophile fish), and functional or morphological groups (e.g. shredders, shrubs, macroinvertebrates) [27–30]. Studies that assess genetic and habitat diversity are also relevant, but will be excluded from this review and should be covered by future systematic reviews.

The indicators related to the diversity of groups of organisms include “diversity”, which is commonly measured by diversity indices such as the Simpson or the Shannon Diversity Index, “richness”, i.e. the number of species, “density”, i.e. the number of species per spatial unit, and “evenness”, i.e. evenness in number of individuals of each species in the area [42]. The indicators related to the abundance of groups of organisms include measures of abundance and density of specimens [43].
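For orientation, the sketch below computes the diversity-related indicators named above from a vector of individual counts per species. The specific variants are assumptions, since primary studies differ: Shannon's index with the natural logarithm, Simpson's index in its "one minus sum of squared proportions" form, and Pielou's evenness. The function name is hypothetical.

```python
import math


def diversity_indicators(counts):
    """Richness, Shannon index, Simpson index (1 - sum p^2) and Pielou evenness
    from the number of individuals recorded per species."""
    counts = [c for c in counts if c > 0]   # species with zero individuals are ignored
    total = sum(counts)
    if total == 0:
        raise ValueError("At least one individual is required.")
    proportions = [c / total for c in counts]
    richness = len(counts)
    shannon = -sum(p * math.log(p) for p in proportions)
    simpson = 1 - sum(p * p for p in proportions)
    # Pielou's evenness is undefined for a single species.
    evenness = shannon / math.log(richness) if richness > 1 else None
    return {
        "richness": richness,
        "shannon": shannon,
        "simpson": simpson,
        "evenness": evenness,
    }


# Four equally abundant species: evenness is maximal.
result = diversity_indicators([10, 10, 10, 10])
print(result)
```

Species density in the sense used above would follow by dividing richness by the sampled area, which has to be taken from the primary study.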

Within this systematic review, we will evaluate for each relevant analysis encountered in a study (hereafter called “case”) whether the groups of organisms considered are specialists related to river dynamics and natural floodplain habitats, and classify them accordingly during data extraction and for the synthesis.

Types of studies

We will include all kinds of studies containing primary data about the impact of floodplain management on biodiversity (see also section “Study quality assessment”).

Potential effect modifiers and reasons for heterogeneity

As we are tackling a broad topic, we anticipate many effect modifiers and reasons for heterogeneity. We will extract several items of relevant information from the studies:

  • General study parameters: country, longitude, latitude, altitude, geographic zone, biogeographic realm, biome [44], Köppen-Geiger climate classes [40], investigated environment (artificial surfaces / agricultural areas / forests / wetlands, semiaquatic, mixed and others (including flooded meadows) / water), years of data collection, Strahler stream order, spatial extent of the study area, naturalness of the study area [45];

  • Methodological variables: the kind of intervention, time since implementation of the measure, study design (cf. Table 4), number of replicates of biodiversity plots per sampling site, sampling method, kingdom (animalia, plantae, fungi, protista, bacteria) and finer taxonomic categories (including functional groups), the size of the species pool (i.e. the number of potentially present species), outcome measure used (species richness, species diversity, etc.), statistical method applied.

Table 4 Scoring sheet for study quality assessment

Study quality assessment

Study quality assessment is required to add quality covariates to the analyses. Reviewers will assess the methodologies used in all articles accepted at full text. The quality assessment will be based on an evaluation of the following five criteria: (i) study design and repetitions, (ii) appropriateness of methods including statistics, and coverage in terms of spatial and temporal scale, (iii) intervention, intra-treatment variation, and confounding factors, (iv) baseline comparison, and (v) reliability of the study including presentation of consistency of methods and results, and missing values. Study quality will be scored following a hierarchy of evidence based on susceptibility to bias [46–48]. The particular system developed for the purpose of this review was adapted from the study quality assessment implemented by Stewart et al. [49]. Each criterion will be scored by the reviewer, and complemented by a short text specifying the reasons for the scoring. For example, a standardised study design like the BACI (Before/After/Control/Impact) type [50] would be of higher quality than a simpler design applying only spatial but not temporal control. The maximum overall score will equal 100 points (Table 4). The scoring might be different for each ‘case’ of analyses detected in a research paper, as sampling effort might vary across the taxa considered, or primary analyses and results might be presented incompletely for some cases. In the following, specifications of quality issues are presented for each of the five criteria:

  (i) Study design and repetitions are crucial aspects that determine the susceptibility of study results to bias, as well as their robustness, explanatory power and generalizability [51]. Scoring will follow a scheme that considers study design expressed in temporal and spatial repetitions (Table 4).

  (ii) Appropriateness of methodology, and spatial and temporal coverage: appropriate sampling methods and statistical approaches are required to make the best and unbiased use of the information gathered. Validity and relevance of study results depend on the appropriateness of the methods used and on appropriate coverage in terms of the spatial and temporal scale of the study.

  (iii) Intervention, intra-treatment variation, and confounding factors: interventions might be badly specified, or many different measures might be treated as ‘interventions’ and compared to control sites. Other confounding factors might lead to the conclusion that the study results are prone to bias or error.

  (iv) Baseline comparison: in environmental sciences many studies might be confounded in terms of the baseline case selected, because the control sites are too different with regard to their ecology or because they had been sampled at a large spatial or temporal distance or even with a different sampling protocol compared to the sampling units subject to interventions.

  (v) Presentation of methods and results, reliability, and missing values: it is impossible to know the rigor that was applied during all stages of a primary study. However, clarity and thoroughness of the presentation of methods and results might indicate overall scientific rigor and reduce the probability of wrong interpretations by the reviewer. Errors might occur during all stages of a study, and confounding statements or very unreliable results in tables and figures that are not mentioned in the text or explained in the discussion might indicate flaws in data processing or reasoning. Missing results for specific cases can lead to directional bias, for instance when only significant results are reported [52].

Data extraction strategy

Data will be extracted from each article and recorded in a spreadsheet. One article can contain several cases of valid and relevant analyses, and each of them will be extracted into a separate spreadsheet row. Data to be extracted will include the intervention and its level of multifunctionality, the outcomes, the methodology and other potentially confounding factors that have been identified as possible reasons for heterogeneity in the primary studies (see above, “Potential effect modifiers and reasons for heterogeneity”).

A major issue in this systematic review is the assessment of whether and how the biodiversity impact of the interventions varies according to their level of multifunctionality. As the multifunctionality of the intervention is not directly obtainable from the primary literature, we will assess the level of multifunctionality for all important interventions based on their average effects on ESS provision. Each intervention might have either a positive, a negative or no influence on the provision of a specific ESS. The matrix concerning this matter will mainly be based on expert evaluations during workshops and teleconferences complemented by relevant information from literature sources. We will also consider ESS that might be related to ‘secondary functions’ or ‘co-benefits’ (sensu [53]). For the ESS classification, the “Mapping and Assessment of Ecosystem Services (MAES)”-scheme will be applied, which is based on the CICES classification [54] and has recently delivered its first applicable results [55]. We will consider 21 ESS and calculate for each intervention a multifunctionality index that equals the number of positively affected ESS minus the number of negatively affected ESS, divided by the total number of ESS considered. This index will range between −1 (all ESS negatively affected) and +1 (all ESS positively affected), and interventions with positive values are assumed to increase the level of multifunctionality.
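The index described above can be written down directly. The sketch below assumes that each intervention's row in the expert matrix is coded as +1 (positive effect), −1 (negative effect) or 0 (no effect) for each of the 21 ESS; this coding and the function name are illustrative choices, not part of the protocol.

```python
def multifunctionality_index(effects, n_services=21):
    """(number of positively affected ESS - number of negatively affected ESS)
    divided by the total number of ESS considered."""
    if len(effects) != n_services:
        raise ValueError(f"Expected {n_services} effect codes, got {len(effects)}.")
    if any(e not in (-1, 0, 1) for e in effects):
        raise ValueError("Effect codes must be -1, 0 or +1.")
    positive = sum(1 for e in effects if e == 1)
    negative = sum(1 for e in effects if e == -1)
    return (positive - negative) / n_services


# Hypothetical intervention: 8 ESS improved, 2 impaired, 11 unaffected.
example_effects = [1] * 8 + [-1] * 2 + [0] * 11
index = multifunctionality_index(example_effects)
print(round(index, 3))
```

Note that an intervention with many positive and equally many negative effects receives the same index (zero) as one without any effect on ESS, which is a property of the definition rather than of this sketch.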

A further important issue is the extraction strategy related to the outcome, i.e. the biodiversity indicators, and we will evaluate for each case, whether the species of an assessed group of organisms are typical for and native to natural floodplain ecosystems.

Data extraction forms will be piloted on a purposive sample of the articles, to represent the range of articles available, and amended if necessary to improve repeatability and efficiency. For most study designs, we expect to extract F, R and R² values as well as p-values, sample sizes and degrees of freedom. Special care will be taken with regard to potential publication bias that occurs when only significant results are presented in a paper that contains several kinds of analyses (e.g. related to subtaxa, subareas). Missing data for the most important issues (e.g. statistics, sample sizes, degrees of freedom) will be calculated or inferred where possible from the summary statistics presented; if this is not possible, the authors will be contacted. Missing data regarding some of the covariates (altitude, years of data collection, Strahler stream order, etc.) will be researched if the stakeholder group considers them relevant at its meeting.

Data synthesis and presentation

Initially a narrative synthesis of the data will be prepared, and extracted cases will be grouped into hierarchical categories by intervention, also considering types of comparators, taxa, time since intervention and study quality. The exact categories will depend on the quality and type of data retrieved during the data extraction stage. One focus of the analyses will be on the evaluation of differences in effect size among established intervention types with apparent promise in a European context given their frequency of implementation and evidence from published accounts. The potential influence of the level of multifunctionality associated with different interventions will also be assessed. Additionally, we will test for the effects of the main covariates such as taxonomic kingdom, time since intervention, and habitat investigated. If extracted data are suitable for quantitative synthesis, we will aim to calculate effect sizes and carry out a meta-analysis [56, 57]. Sensitivity analysis will be run to explore the effects of including studies with different designs and methodological quality. We will consider the different comparators in different analyses, as effect size has a totally different (even opposite) meaning when the effect of an intervention is compared to a previous unrestored situation or to the situation of a natural remnant. We will limit our analyses in the first instance to cases dealing with specialist floodplain species, and test later whether the same pattern can be detected for generalist species. Non-native species will be analysed separately, if the number of cases is high enough to enable a quantitative analysis.

If insufficient data are extracted, if data are mainly of low methodological quality, or if the literature is too heterogeneous with regard to the interventions, we will limit our summary to a narrative synthesis and present the outcomes in tables and possibly systematic knowledge maps. Outcomes from addressing both the primary and secondary questions posed here will be discussed with selected stakeholder groups, and implications for multifunctional floodplain management in Europe will be considered.