Our methods describe a protocol that we will use for a systematic review of the management of nine priority invasive alien plant species in England and Wales. For the purposes of this protocol, an invasive alien plant is defined as a species that is considered to be outside of its native biogeographic range, that has spread as a result of human activity, and that causes negative ecological or economic impacts. Because management interventions will be similar for different invasive species, this approach will allow us to answer broader scale questions about the management of invasive plant species. Our work will initially focus on global evidence about the eight invasive plant species identified as priorities by Defra in England and Wales, and one other species we have identified as a priority, but we envisage that similar protocols could be used to investigate invasive plant management of any species in any geographic region.
Our systematic review will make use of a new website for screening literature, extracting data, and presenting dynamic meta-analyses (http://www.metadataset.com). We aim to use this website to host multiple systematic reviews, using the current systematic review as a proof-of-concept. At present the website allows reviewers to screen papers, use automated kappa analysis, and enter study data. In addition, the website allows users to browse the included publications by management intervention, outcome, or country. In the future, users will be able to browse the data extracted from these publications and interact with meta-analyses by selecting subsets of the data (e.g. data from selected countries or selected management interventions). We invite potential collaborators to contact us.
The methods for this systematic review follow the Collaboration for Environmental Evidence guidelines for evidence synthesis, and a ROSES (RepOrting standards for Systematic Evidence Synthesis) checklist has been completed (Additional file 1).
The protocol has been developed as part of the BioRISC (Biosecurity Research Initiative at St Catharine’s) programme. This programme aims to build, integrate, and synthesise evidence across the different domains of biosecurity, including invasive alien species. We have discussed this protocol with members of the BioRISC team and the Conservation Evidence group at the University of Cambridge. The authors of this paper include experts in terrestrial and aquatic invasive species (BG, JMB, DA) and we consulted with one other academic expert. Although we did not engage directly with Defra, this protocol has been developed in response to their consultation on the management of priority invasive alien species in England and Wales. A previous systematic map protocol published in Environmental Evidence was used as the basis for this protocol.
Searching for articles
Our search strategy is designed to retrieve all the publications on each invasive plant species of interest. We will search for publications in four bibliographic databases that were identified as relevant by the review team. In addition to published, peer-reviewed literature, we will also search for unpublished research to minimize the risk of publication bias. Information on the studies screened will be held on the Metadataset website. We will also allow people to submit any relevant papers that they feel we have missed during our searches and we will subsequently screen these using the methodology outlined in this protocol.
For each species, we will use its scientific name(s) and English common name(s) as the search string, based on a standard list of taxonomic synonyms, and details of all of these search strings are given in Table 1. These search strings should produce the most inclusive set of search results possible, because they will not use “AND” terms (e.g. ‘“Impatiens glandulifera” AND control’). Where Boolean search terms (e.g. “OR” or “AND”) are not fully supported by a database (e.g. Google Scholar, which only allows basic Boolean searching) we will modify our search strategy to use only the most commonly used scientific name. This search strategy is likely to be very comprehensive, as it is designed to find the available literature on each species of interest. It also has the advantage of making searches relatively easy to update, as well as allowing for changes in taxonomy to be dealt with easily in the future. Because our search strings will have high ‘sensitivity’ (i.e. they will retrieve a high proportion of all the publications about each invasive alien species), the need for a benchmark list of articles to test comprehensiveness is minimised.
Because English common names may represent common words or phrases, we will subject all keyword searches to sensitivity tests. Where the inclusion of common names increases the number of papers returned by a search by ≥ 100, we will restrict the search by adding “AND (control OR manag*)” for common names only.
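The rule above can be sketched in a few lines of code. This is only an illustrative sketch: the function name `build_query`, its parameters, and the hit counts in the usage example are hypothetical, and the exact query syntax accepted by each database will differ.

```python
def build_query(scientific_names, common_names,
                hits_scientific_only, hits_with_common, threshold=100):
    """Assemble a Boolean search string for one species.

    Hypothetical helper: the two hit counts are assumed to come from
    trial searches run with and without the common names.
    """
    sci = " OR ".join(f'"{name}"' for name in scientific_names)
    com = " OR ".join(f'"{name}"' for name in common_names)

    # Number of extra records returned when common names are added.
    extra_hits = hits_with_common - hits_scientific_only

    if extra_hits >= threshold:
        # Common names look ambiguous, so restrict them (and only them).
        common_clause = f"(({com}) AND (control OR manag*))"
    else:
        common_clause = f"({com})"

    return f"({sci}) OR {common_clause}"


# Usage with made-up hit counts: 300 extra records triggers the restriction.
query = build_query(["Impatiens glandulifera"], ["Himalayan balsam"],
                    hits_scientific_only=900, hits_with_common=1200)
print(query)
```

The scientific-name clause is never restricted in this sketch, which mirrors the statement that the additional terms apply to common names only.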
We will review publications in English only. We will tag all publications that are excluded because they were not in English. We acknowledge that excluding literature written in non-English languages is a shortcoming. However, we do not have the resources needed to work in other languages, especially since checking of consistency requires dual-screening of articles.
Publication databases to be searched
Publications will be collated from the following databases:
Scopus and Web of Science searches will use the ‘title, keyword, and abstract function’ and will be performed using a University of Cambridge subscription. In addition to specialist bibliographic databases we will search one web-based search engine, Google Scholar (first 50 results sorted by relevance). Previous work has shown that this method is useful for finding both academic and grey literature.
We will also search specialist websites for relevant studies. The first of these, https://www.conservationevidence.com, contains systematically collated evidence on the effects of management interventions taken both from the academic and grey literature [18, 21]. This website has already collated evidence on the control of three of the priority species that are the subject of this review (floating pennywort, American skunk cabbage, and Parrot’s feather) and so represents a useful resource for the planned systematic review. In addition, we will examine accounts for each species on the CABI Invasive Species Compendium website (https://www.cabi.org/isc/) and on the IUCN ‘EU regulation on invasive alien species’ website (https://bit.ly/2SJjfgG) to identify potential studies relating to management both from the academic and grey literature.
Additional search method
During screening of full texts, the reference lists of studies that meet inclusion criteria and any relevant secondary literature will be examined to identify other potentially useful studies, a process sometimes called ‘snowballing’. The same process will be carried out using any relevant studies found in this way until no relevant studies are found in reference lists. This method may help to find literature that was missed in keyword searches as well as ‘grey’ literature.
Article screening and study eligibility criteria
We will screen publications in two stages: first using titles and abstracts, and then using full texts. At each stage we will decide whether to include or exclude a publication based on the eligibility criteria we set out below. At each stage we will record the number of publications included/excluded, and we will provide a list of the full texts that were excluded, together with reasons for their exclusion and a ROSES flow diagram. To reduce the risk of bias, members of the review team will not screen any articles that they have authored; these papers will be screened by other members of the team.
To check for consistency, a random sample of 10% of titles/abstracts will be screened by two people using our inclusion criteria. Any disagreements between the two people will be discussed, and eligibility criteria will be revised to show how disagreements were resolved. Kappa scores will be calculated to test the agreement between the two people. If Kappa scores are below 0.6, another 10% of titles/abstracts will be screened by the same two people. Disagreements will be discussed and resolved again, Kappa scores recalculated, and the consistency checking process repeated until Kappa scores are greater than 0.6.
Following this, a sample of 10% of the full texts of publications that meet inclusion criteria based on their titles/abstracts will be screened by two people. This process will mirror that detailed above for title/abstract screening.
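The agreement statistic used in these consistency checks can be illustrated with a minimal sketch of Cohen's kappa for two screeners. The function names and the example decisions are hypothetical, and this is not the implementation used by the Metadataset website. We assume here that a score of exactly 0.6 is acceptable, since the text only triggers a further round when kappa is below 0.6.

```python
def cohens_kappa(decisions_a, decisions_b):
    """Cohen's kappa for two screeners' decisions on the same records."""
    if len(decisions_a) != len(decisions_b) or not decisions_a:
        raise ValueError("Both screeners must rate the same non-empty sample")
    n = len(decisions_a)

    # Observed proportion of records on which the screeners agree.
    observed = sum(a == b for a, b in zip(decisions_a, decisions_b)) / n

    # Agreement expected by chance, from each screener's label frequencies.
    labels = set(decisions_a) | set(decisions_b)
    expected = sum((decisions_a.count(label) / n) * (decisions_b.count(label) / n)
                   for label in labels)

    if expected == 1.0:
        # Both screeners used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


def needs_another_round(kappa, threshold=0.6):
    """True if a further 10% sample should be dual-screened."""
    return kappa < threshold


# Usage with made-up decisions: the screeners disagree on one of ten records.
screener_1 = ["include"] * 4 + ["exclude"] * 6
screener_2 = ["include"] * 3 + ["exclude"] * 7
kappa = cohens_kappa(screener_1, screener_2)
print(round(kappa, 3), needs_another_round(kappa))
```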
In this systematic review we will use a “PICO” approach to determine eligibility criteria. To be included in our systematic review, studies must meet the criteria detailed below.
We are interested in multiple populations in this systematic review. Firstly, we are interested in the following invasive species: Japanese knotweed (Fallopia japonica), Nuttall’s waterweed (Elodea nuttallii), Chilean rhubarb (Gunnera tinctoria), Giant hogweed (Heracleum mantegazzianum), Floating pennywort (Hydrocotyle ranunculoides), Himalayan balsam (Impatiens glandulifera), Curly waterweed (Lagarosiphon major), American skunk cabbage (Lysichiton americanus), and Parrot’s feather (Myriophyllum aquaticum). In addition, we are interested in the environments/ecosystems that have been, or may be, impacted by these invasive alien species and that may be affected by management interventions to control or prevent them.
We are interested in all management interventions that attempt to exclude or control the invasive alien plant species detailed above. These interventions will be categorised using a provisional classification of practices that the review team have developed (see Additional file 2). We will adapt the categories in this classification scheme as we screen the literature and identify interventions that were not previously classified. The classification scheme is intended to be used as a means of classifying management interventions for invasive alien plant species that can be reused for any invasive plant species in the future. General classifications for management interventions in conservation already exist, but these are defined at too low a resolution to be useful for the current systematic review.
We will include any study that compares an area or time in which a management intervention has been carried out with one where no management intervention has been undertaken. To aid synthesis we will label studies based on different elements of their design. We will include both experimental and correlative studies. Experimental studies will be defined as those in which treatment and comparator groups were defined prior to a management intervention being carried out. Correlative studies will be defined as those in which comparators were defined after a management intervention was carried out. An example of a correlative study is one that compares areas grazed by livestock to control an invasive plant with areas that were selected because they are similar to the grazed site but where no livestock grazing was used. This distinction is important because post hoc definition of comparators may result in biases in counterfactual comparisons.
Studies that compare an area where a management intervention has been used with an area where it has not been used will be labelled as controlled. This includes comparisons with areas where no management was used and with areas where alternative management interventions were used (e.g. a comparison of herbicide vs cutting). Studies that compare a time before a management intervention was used with a later time after it was used will be labelled as before-and-after studies. These labels are not mutually exclusive, so we will also include studies that combine both design elements, commonly referred to as before-after-control-impact studies. Where experimental studies have used a blocked design, they will be labelled as such, and when correlative studies used paired sites, this will also be recorded. We will also include studies that have investigated different intensities of the same management intervention (e.g. comparison of different concentrations of herbicide).
Both replicated and unreplicated studies will be included in our systematic review, and we will label them as either replicated or not. We will also label studies as randomized or not. We will include studies carried out in natural habitats, under greenhouse conditions, or in laboratories. Further information on the methodological details of studies that we will record can be found in Table 2.
We will include studies of all environmental outcomes, which we will categorise based on a provisional classification of outcomes that we have developed (see Additional file 3). This includes outcomes relating to invasive plant species and native plant species, as well as to other biodiversity, crop production, soil, water, and pollutants. Note that we will not include outcomes that relate to the adoption of management interventions by practitioners (e.g. the number of practitioners using herbicide to control invasive plants), which we see as being beyond the scope of this work. In addition, where biological control has been used to control invasive alien plants, we will not extract data about the biocontrol agents themselves. At the full-text screening stage, studies that do not present tables and figures relating to outcomes will be excluded.
Similarly to our classification scheme of management interventions, this classification scheme will be used as a means of categorising different outcomes used in studies that have tested management interventions. This classification scheme will be updated if we find additional relevant outcomes during screening and data extraction. This will allow us to develop a classification of outcomes that is reusable for other invasive plant species.
Study validity assessment
Critical appraisal strategy
We will critically appraise all studies that we consider to be suitable for quantitative synthesis to assess their validity. Doing this will allow us to identify studies that are likely to be more prone to biases that affect internal validity. We will use sensitivity analysis to further examine these biases.
Table 2 gives the criteria for study validity assessment. These criteria represent the variables that we consider to be critical in influencing the internal validity of study findings, which focus on the effects of selection bias and performance bias. We will not assess to what extent results from individual studies are generalisable (external validity) as this will vary depending on the context of the reader, given that generalisability is likely to vary geographically and taxonomically, as well as in other ways.
Data on criteria in Table 2 will be extracted from relevant studies. Any studies for which the answer is ‘no’ or ‘unclear’ to any of the questions will be assigned as having low validity. Studies that are not assigned as having low validity will be assigned as having medium validity if any of the questions are answered as ‘partially’. The remainder of studies will be classified as having high validity. The study validity of a random sample of 10% of studies for each species will be determined by two reviewers. Any reasons for disagreement will be discussed.
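The decision rule for assigning validity categories can be written as a short function. This is a sketch under the assumption that each Table 2 question is answered with one of ‘yes’, ‘partially’, ‘no’, or ‘unclear’; the function name `classify_validity` is hypothetical.

```python
def classify_validity(answers):
    """Assign a validity category from answers to the Table 2 questions.

    `answers` is an iterable of strings, assumed to be one of
    'yes', 'partially', 'no', or 'unclear' for each question.
    """
    normalised = [answer.strip().lower() for answer in answers]

    # Any 'no' or 'unclear' answer makes the study low validity.
    if any(answer in ("no", "unclear") for answer in normalised):
        return "low"
    # Otherwise, any 'partially' answer makes it medium validity.
    if any(answer == "partially" for answer in normalised):
        return "medium"
    # All remaining studies are high validity.
    return "high"


# Usage with made-up answers for three studies.
print(classify_validity(["yes", "partially", "unclear"]))  # low
print(classify_validity(["yes", "partially", "yes"]))      # medium
print(classify_validity(["yes", "yes", "yes"]))            # high
```

The order of the checks matters: a study with both a ‘partially’ and an ‘unclear’ answer is classed as low validity, because the low-validity test is applied first.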
Data on study validity will be presented alongside details extracted from the studies and descriptive statistics calculated to give an overview of study validity. Where possible, we will then use sensitivity analyses to examine how summary effect sizes are altered by exclusion of studies with particular characteristics. This will involve the exclusion of studies assessed as having low validity followed by exclusion of those with both low and medium validity.
Data coding and extraction strategy
Metadata extraction and coding
Metadata will be extracted from studies that meet our selection criteria. Metadata on the context and PICO elements that we will extract from each study are detailed in Table 3.
Data extraction strategy
We will extract the mean values of treatments (e.g. plots with management interventions) and controls (e.g. plots without management interventions or which used alternative management interventions). If available, we will also extract measures of variability around the mean (standard deviation, variance, standard error of the mean, or confidence intervals), number of replicates, and the P value of the comparison between treatments and controls (see Table 4 for a list of data that will be extracted). Comparisons will be made only within a single figure/table, not between figures/tables. For example, if studies were done in two or more areas, but these results were presented in separate figures/tables, we will assume that a comparison cannot be made across tables unless specified otherwise in the article text. Where data are presented for multiple years, data from all years will be extracted. Where data are presented for multiple sites, data for all sites will also be extracted. We will not contact primary authors to request additional data. All extracted data will be made available as additional files.
Approaches to missing data
Missing data can present a serious issue for meta-analyses in ecology and conservation, as reporting standards in the literature are very variable. Commonly missing items include information about sampling variance or sample sizes. Where possible, if information is missing for components required for effect size calculation and study weighting (i.e. mean values, sample sizes, measures of variability) we will calculate these values based on the available information. For example, P values can be converted into t- or F-statistics, which can subsequently be used to calculate effect sizes. To investigate the influence of using variance to weight analyses we will use sensitivity analysis to compare results for both weighted and unweighted meta-analyses.
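As one illustration of calculating missing components from the available information, the sketch below converts the different measures of variability that studies may report into a standard deviation. The function names are hypothetical. The confidence-interval conversion assumes a symmetric interval based on the normal distribution, which is an approximation: intervals in small-sample studies are often based on the t distribution, in which case this would slightly overestimate the standard deviation.

```python
import math
from statistics import NormalDist


def sd_from_se(standard_error, n):
    """Standard deviation from the standard error of the mean."""
    return standard_error * math.sqrt(n)


def sd_from_variance(variance):
    """Standard deviation from the variance."""
    return math.sqrt(variance)


def sd_from_ci(lower, upper, n, level=0.95):
    """Standard deviation from a symmetric confidence interval.

    Assumption: the interval was built as mean +/- z * SE using the
    normal distribution (z is about 1.96 for a 95% interval).
    """
    z = NormalDist().inv_cdf(0.5 + level / 2)
    standard_error = (upper - lower) / (2 * z)
    return standard_error * math.sqrt(n)


# Usage with made-up values: SE = 2 with 25 replicates gives SD = 10.
print(sd_from_se(2.0, 25))
print(sd_from_variance(9.0))
print(round(sd_from_ci(6.08, 13.92, 25), 2))
```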
Consistency of data extraction will be checked by randomly sampling 10% of the studies that meet inclusion criteria and having their data extracted by two people. Discrepancies in the data extracted will be discussed and extracted data modified accordingly. No Kappa analysis will be performed at this stage as our goal will be for complete agreement between reviewers.
Potential effect modifiers/reasons for heterogeneity
Potential reasons for heterogeneity will be explored using sub-group analysis, meta-regression, and sensitivity analysis (Table 5). Where relevant we will expand the list of potential modifiers during the synthesis process. This list was compiled following discussion within the review team.
Data synthesis and presentation
We will present quantitative data synthesis for intervention-outcome combinations that have ≥ 1 data points. Interactive evidence synthesis will also be provided on Metadataset.
Narrative synthesis strategy
We will include a narrative synthesis to give an overview of the studies used in the systematic review. This will include presentation of tables containing relevant information about the studies that meet our inclusion criteria and further discussion of studies, where necessary.
Quantitative synthesis strategy
The minimum amount of data we will need to calculate an effect size will be the treatment mean and control mean, which will allow us to calculate the response ratio. We will calculate the mean effect of each intervention on each outcome, using standard meta-analytic methods. If we have only the minimum amount of data for the effect of an intervention on an outcome (i.e. treatment means and control means), we will only be able to do an unweighted meta-analysis. Otherwise, we will do a weighted meta-analysis (e.g. weighted by the inverse of variance, but sensitivity analyses using other weights will also be available on the website, as will unweighted meta-analysis). We will always use random-effects meta-analysis, as study-level effects are likely to vary because of true variation, rather than solely experimental error, as assumed by fixed-effects meta-analysis. We will also explore heterogeneity using subgroup analysis and meta-regression, where appropriate. We will use standard methods for estimation of publication bias (e.g. fail-safe numbers and funnel plots). Where relevant we will report the results of any sensitivity analyses.
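The calculations described above can be illustrated with a minimal sketch. The log response ratio and its approximate sampling variance follow the commonly used formulas; the random-effects model is fitted here with the DerSimonian-Laird estimator of between-study variance, which is our assumption for illustration, as the protocol does not specify an estimator. All function names and the example values are hypothetical.

```python
import math


def log_response_ratio(mean_treatment, mean_control):
    """ln(treatment mean / control mean); both means must be positive."""
    return math.log(mean_treatment / mean_control)


def lnrr_variance(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Approximate sampling variance of the log response ratio."""
    return sd_t ** 2 / (n_t * mean_t ** 2) + sd_c ** 2 / (n_c * mean_c ** 2)


def unweighted_mean_effect(effects):
    """Unweighted mean effect, used when only means are available."""
    return sum(effects) / len(effects)


def random_effects_mean(effects, variances):
    """Inverse-variance random-effects mean (DerSimonian-Laird tau^2).

    Returns (pooled effect, standard error, tau^2).
    """
    k = len(effects)
    weights = [1 / v for v in variances]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w

    # Cochran's Q and the between-study variance estimate.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if k > 1 and c > 0 else 0.0

    # Re-weight each study by the inverse of (sampling + between-study) variance.
    re_weights = [1 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    standard_error = math.sqrt(1 / sum(re_weights))
    return pooled, standard_error, tau2


# Usage with made-up data: (mean_t, sd_t, n_t, mean_c, sd_c, n_c) per study.
studies = [(5.0, 1.0, 4, 10.0, 2.0, 4), (3.0, 1.5, 6, 9.0, 3.0, 6)]
effects = [log_response_ratio(s[0], s[3]) for s in studies]
variances = [lnrr_variance(*s) for s in studies]
print(random_effects_mean(effects, variances))
print(unweighted_mean_effect(effects))
```

A negative log response ratio indicates that the outcome was lower in treated plots than in controls, for example lower cover of the invasive plant after management.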
Knowledge gap identification
We will identify knowledge gaps when the systematic review is completed. We will use our categorisation of both management interventions and outcomes to identify where there is a lack of knowledge for particular outcomes for each management intervention. If the systematic reviews are updated on the metadataset website, we will modify information on the website to identify where gaps remain.