Background

The need to quantify anti-predator responses

Mammals are experiencing an alarming rate of extinction [1,2,3] due to anthropogenic impacts such as habitat loss and fragmentation, illegal hunting, and exotic predators [4]. Redressing this loss of biodiversity requires well-informed and well-tested management interventions. Many of these interventions will need to be underpinned by a mechanistic understanding of species’ behaviour.

How an animal responds to predators has substantial bearing on its ability to survive. Predation, particularly from introduced predators, has been a major driver of mammal declines and extinctions around the world [5,6,7,8,9]. This is especially true for individuals and populations that have had limited or no exposure to predators, such as many island populations [10, 11], individuals raised in captivity and those moved to an environment with novel predators [12,13,14]. Improving our understanding of how animals behave in response to predatory stimuli should provide crucial insights for their conservation management and can improve our ability to retain antipredator traits in managed populations [12, 15, 16]. An animal’s response to predators may be behavioural (e.g. spatial and temporal avoidance [17, 18], avoiding detection [19] and evasion [20]) or physical (e.g. chemical [21] and physical defences [22]). Behavioural responses are likely to be more plastic, and to change over shorter time frames, than physical responses. They are therefore particularly important when considering the acute impacts of predators on the persistence of predator-naïve species.

Common strategies employed to prevent faunal extinctions include captive breeding [23], translocations (the deliberate movement of animals from one population or site for release in another [24]) and establishment of populations in predator-free havens (areas isolated from predators through a geographical or physical barrier, such as islands or fenced enclosures [25,26,27]). Such approaches have secured a number of populations of mammals, including African elephants [28, 29], European lynx [30], elk [31], giant pandas [32], and Tasmanian devils [33]. Despite their initial successes, these strategies are at risk of longer term failure because they can expose naïve individuals to novel contexts for which they may lack appropriate behavioural responses. Further, such populations become vulnerable to acute population collapses from uncontrolled predator incursions.

Australia provides a compelling case study to illustrate the challenges of mammal conservation. More than one third of modern mammal extinctions have occurred in Australia, largely due to the introduction of feral cats and foxes [34]. In response, havens free of introduced predators are a key component of conserving much of the remaining mammal fauna [26, 27, 35]. Australia’s current network of havens provides habitats for at least 32 mammal species, and has secured at least 188 populations and sub-populations [26]. Evidence is emerging, however, that in the absence of feral and/or native predators, havened populations no longer exhibit anti-predator behaviours [13, 36,37,38,39,40]. This renders individuals in these populations fundamentally unfit for reintroduction into areas where predators still persist. Because the success of many translocations has ultimately been compromised by predation [35, 41, 42], the future of mammal conservation in Australia, and more broadly, hinges on developing methods and strategies that can quantify and conserve antipredator behaviours in havened and translocated populations [39].

To undertake an adaptive management approach, we require monitoring and evaluation of anti-predator responses in mammalian species. Despite awareness that behavioural traits such as boldness or shyness can influence conservation outcomes, measuring such traits is rarely incorporated into monitoring and management [16, 43]. Anti-predator responses have only recently been identified as a potential barrier to the success of conservation projects [13, 37,38,39], and while an array of academic literature exists that details various methods for measuring these behaviours [15, 38, 39, 44,45,46,47,48], accessing the methodologies, comparing them for rigor and identifying the most appropriate measure is labour intensive. Stakeholders, such as conservation and population managers, are likely to be seeking this information, but also likely to be limited by the time and resources necessary to find it. Ultimately, we currently lack a robust framework for the universal monitoring and evaluation of anti-predator traits [49]. The first step to developing such a framework is to understand which behavioural assays have been conducted, which are most effective (capture or provoke the greatest behavioural response), and whether the type of predator cue is important. In the absence of this crucial information, the adoption of inappropriate and poorly-performing behavioural metrics may prevail.

Identification and engagement of stakeholders

In addition to the review team, stakeholders relevant to this review have been identified as those who research or manage animal populations, for example, members of species recovery teams (Fig. 1). To ensure the information collected throughout this review is tailored toward the target audience, and thus of the most relevance for application, a variety of stakeholders from each of the categories in Fig. 1 were consulted during the development of this protocol. We invited 27 stakeholders to comment on the draft protocol, and after receiving 16 replies (ten from Australia and six from other countries), we incorporated their suggestions.

Fig. 1

End-user stakeholder groups (right-hand boxes) consulted when designing a systematic review of methods that quantify anti-predator behaviour in mammals. Arrows indicate each group’s broad interests in the various steps (left-hand boxes) required for improving conservation outcomes

Objective of the review

We will present all known behavioural assays for measuring or quantifying anti-predator responses in mammals by collating information into an accessible format. Specifically, we will: (1) reveal different methods, (2) describe the context within which each method was conducted, and (3) highlight methods or aspects that warrant further examination, thus guiding the future development of behavioural assays. Further, using a modelling approach, we will then identify which types of behavioural assays and predator cues elicit the greatest responses in mammals (difference in effect size between the treatment and control conditions). A formal evidence synthesis is required to explore all potential methods and to avoid bias toward those published in academic journals, because much information may come from governmental reports and species recovery plans [16, 50]. The final review will act as a guide: it will highlight existing methodologies and provide additional information to assess their relevance, allowing stakeholders to easily select the most appropriate and effective behavioural assay for their purpose.

Using the PICO (Population—Intervention—Comparator—Outcome) framework [51], we have broken our review into two questions that will define our search scope. We will first systematically map all known methodologies answering a primary question: what behavioural assays have been used to quantify anti-predator responses in mammals? The elements of this question are:

Population

Free-living, wild-caught, or captive mammals (global).

Intervention

  (i) A behavioural assay that quantifies anti-predator responses to predator exposure

  (ii) A behavioural assay that quantifies anti-predator responses to predator cues

Articles that conform to both the Population and Intervention criteria will be used to answer this primary question. A secondary question we seek to answer will be assessed quantitatively by modelling the metadata collected from each article, asking: which assay-types and predator cues elicit the greatest behavioural responses? This question utilises the same Population and Intervention criteria as the primary question, but requires further assessment using Comparator and Outcome criteria to select studies for the systematic review. The additional elements of the secondary question are:

Comparator

Comparison between levels of predator exposure (e.g. before versus after exposure, exposure versus no exposure) or comparison between exposure to a predator cue versus a control.

Outcome

Difference in the behavioural response between the treatment (e.g. predator/predator cue exposure) and control conditions. Metrics of responses will differ between studies depending on assay type and will be compared using standardised effect sizes.

Articles that involve at least one Comparator element can then additionally be considered for the systematic review to investigate which Intervention elements (behavioural assays and predator cues) produce the greatest Outcome. The PICO elements of our two questions are illustrated in Fig. 2.

Fig. 2

Elements of target questions illustrated using the PICO framework

Methods

Searching for articles

Scoping

To develop a search strategy, an initial scoping exercise was conducted using a test-list of 10 benchmark articles that assess anti-predator responses (Additional file 1), selected because they cover a variety of different assays and predator scenarios. The titles, key words, and abstracts of each scoping article were mined, both manually and using word clouds (R package wordcloud [52]; in the R environment [53]), to determine the most appropriate search terms [54]. An initial search string was then created using Boolean operators to combine the relevant terms, based on the review team’s knowledge and the terms identified from the scoping articles. Trial searches were conducted using the Web of Science: Core Collection. We systematically removed terms that appeared to broaden the search outside the scope of the review. To ensure the proposed strategy adequately returned relevant literature, the search output was scanned for relevant articles and for each of the scoping benchmark articles. Unreturned articles were then closely inspected, and the search strategy was adjusted until it retrieved all 10 benchmark articles [51]. The comprehensiveness of the search strategy was then tested using a list of 5 independent articles (Additional file 1), all of which were retrieved by the final search strategy.

Search strategy

To begin collating articles for this review, bibliographic databases will be searched using the following search string (which will be modified for each specific database language).

TS = ((("antipredator response$" OR "anti-predator response$" OR "antipredator behavio$r" OR "anti-predator behavio$r" OR "escape behavio$r" OR "giving$up density" OR "FID" OR "GUD" OR "flight initiation distance") AND ("predator exposure" OR "prey naivete" OR "naive prey" OR "los$" OR "trait" OR "predator avoid*")) OR (("predator recognition" OR "predator exposure" OR "predation risk" OR "introduced predator$" OR "novel predator$" OR "predator odour") AND ("naive prey" OR "prey naivete" OR "escape behavio$r" OR "giving$up density" OR "flight initiation distance" OR "FID" OR "GUD" OR "predator odour")) OR (("antipredator response$" OR "anti-predator response$" OR "antipredator behavio$r" OR "anti-predator behavio$r" OR "escape behavio$r") AND ("predator recognition" OR "predator exposure" OR "introduced predator$" OR "novel predator$")))
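The wildcards in the search string can be illustrated with a small sketch. It assumes Web of Science semantics, in which `$` stands for zero or one character and `*` for any number of characters; other databases may interpret these symbols differently, which is why the string will be adapted per database. The function name `term_to_regex` is hypothetical, and treating `*` as "any run of word characters" is a simplification made for this illustration.

```python
import re


def term_to_regex(term):
    """Translate one quoted search term into a case-insensitive regex.

    Assumed wildcard semantics (Web of Science style):
      $  -> zero or one character (e.g. behavio$r: behavior, behaviour)
      *  -> zero or more characters (simplified here to word characters)
    """
    parts = []
    for ch in term:
        if ch == "$":
            parts.append(".?")      # zero or one character of any kind
        elif ch == "*":
            parts.append(r"\w*")    # simplification: word characters only
        else:
            parts.append(re.escape(ch))
    return re.compile(r"\b" + "".join(parts) + r"\b", re.IGNORECASE)


# Both spellings are caught by a single term.
spelling = term_to_regex("behavio$r")
print(bool(spelling.search("Antipredator behaviour in wallabies")))
print(bool(spelling.search("antipredator behavior in voles")))
```

Under these assumptions, "giving$up density" covers "giving-up density" and "giving up density", and "predator avoid*" covers "predator avoidance".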

Academic literature

To collect peer-reviewed journal articles, we will search the following bibliographic databases, chosen for the subject matter each covers: Web of Science (Core Collection, BIOSIS Citation Index, Zoological Record, CAB abstracts) and Scopus.

Grey literature

To reduce bias toward published literature, we also aim to search a variety of grey literature sources [49, 50]. Using our search string above, we will collate theses and dissertations from two bibliographic databases specific to grey literature: ProQuest Dissertation and EThOS: UK Theses and Dissertations. Conference proceedings will be searched in the Web of Science database using the predetermined search strategy. The following websites will also be searched, using the search terms “anti-predator” and “antipredator”: opengrey.eu and trove.nla.gov.au. Specialist documents will be searched for within the following repositories, using the search terms “anti-predator” and “antipredator”: IUCN general publications (https://portals.iucn.org/library/dir/publications-list); IUCN Conservation Planning Specialist Group (http://www.cpsg.org/document-repository); Conservation Evidence (http://www.ConservationEvidence.com); WWF (https://www.worldwildlife.org/publications). A web-based search engine, Google (www.google.com), will be used to supplement our search results. The first 50 links returned using each combination of the search terms “anti-predator/antipredator” and “behaviour/behavior” will be inspected and added to the article pool if not yet identified [55].

Additional literature

Based on the knowledge of the review team and stakeholders, additional publications not identified by the search strategy may also be included.

Search results will be limited to articles written and published in English (due to the language capabilities of the review team). All database and grey-literature searches will be documented, and this information will be made available with the final review publication. All searches will be conducted within two years of the final analysis being submitted for publication.

Article screening and study eligibility criteria

Duplicate articles will be removed, and article screening will be conducted through CADIMA [51, 56]. To reduce bias, two screeners will independently review articles, first at title and abstract level (assessed together) and then at full-text level, to decide which meet the inclusion criteria. Each screener will assess an overlap of 10% of all articles (to a maximum of 50 articles) at both the title/abstract stage and the full-text stage. Reliability between screeners will be assessed using Kappa calculations (with values > 0.5 deemed acceptable [12, 57]). In instances where screeners do not agree on the inclusion/exclusion of an article, they will discuss it and, if necessary, consult a third member of the review team. If theses or dissertations have additionally been published as journal articles or specialist reports, we will assess the methods described in both, and only include the article that provides the most detail. While not anticipated, if reviewers find themselves assessing their own work, a third impartial member of the review team will oversee the assessment of any articles for which such a conflict arises. A full list of excluded articles will be made available with the final review, with the reasons for their exclusion.
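The consistency check between screeners can be sketched as follows. This is an illustration only: it assumes that "Kappa" refers to Cohen's kappa for two raters, and that the 10% overlap is rounded up to a whole article; neither detail is specified in the protocol. The function names are hypothetical.

```python
import math
from collections import Counter


def overlap_sample_size(n_articles, percent=10, cap=50):
    """Number of articles both screeners assess (rounding up is an assumption)."""
    return min(math.ceil(n_articles * percent / 100), cap)


def cohens_kappa(decisions_a, decisions_b):
    """Cohen's kappa for two screeners' decisions on the same articles."""
    if len(decisions_a) != len(decisions_b) or not decisions_a:
        raise ValueError("both screeners must rate the same non-empty set")
    n = len(decisions_a)
    # Proportion of articles on which the screeners agree.
    observed = sum(a == b for a, b in zip(decisions_a, decisions_b)) / n
    # Agreement expected by chance, from each screener's label frequencies.
    counts_a, counts_b = Counter(decisions_a), Counter(decisions_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / n**2
    if expected == 1:
        return 1.0  # both screeners used one identical label throughout
    return (observed - expected) / (1 - expected)


screener_a = ["include"] * 5 + ["exclude"] * 5
screener_b = ["include"] * 4 + ["exclude"] * 6
kappa = cohens_kappa(screener_a, screener_b)
print(round(kappa, 2), kappa > 0.5)  # 0.8 True
```

In this made-up example the screeners disagree on one of ten articles, giving an observed agreement of 0.9 against a chance agreement of 0.5, so kappa is 0.8 and exceeds the 0.5 threshold.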

Each article will be screened against eligibility criteria based on the PICO framework, as outlined in Table 1. The screeners will first review the title and abstract of each article together to assess whether it satisfies the eligibility criteria (Table 1).

Table 1 Study eligibility criteria based on PICO (Population—Intervention—Comparator—Outcome) framework

Articles that satisfy the Population and Intervention eligibility criteria will be used to pursue the primary question. They will then be assessed against the Comparator and Outcome eligibility criteria for inclusion in the secondary quantitative component, where they may address the effectiveness of the Intervention elements (either assay types or predator cue types). All articles considered for this analysis must have incorporated at least one of the Comparator elements and all of the Outcome elements listed in Table 1. In articles with more than one predator cue or population type (e.g. current, historic and control predator cues, or populations exposed more than five years ago, exposed in the last five years, and never exposed), we will extract the effect size (difference between the treatment condition and the control) of the cue or population that was hypothesized by the authors to elicit the largest response (thus limiting the number of data entries from each article to one per assay).

Study validity assessment

Studies that satisfy the Population and Intervention criteria but not the Comparator and Outcome criteria will not be critically appraised and will be used exclusively in the narrative synthesis identifying different methodologies for quantifying anti-predator responses. Studies that satisfy all four Population, Intervention, Comparator and Outcome eligibility criteria will undergo further critical appraisal using the CEE critical appraisal tool (Additional file 2, [59]). Critical appraisal will be undertaken by two members of the review team, and each appraiser will assess an overlap of 5% of studies (to a maximum of 20) to ensure consistency. If appraisers reach different conclusions about any study, the validity criteria will be refined, and consistency checking will be repeated.

Data coding and extraction strategy

Once screened, the following meta-data variables will be extracted or scored where possible:

  • Species

    • Common name

    • Latin name

    • IUCN conservation status

    • Size (small < 5 kg, medium 5–20 kg, large > 20 kg)

  • Assay

    • Assay type (e.g. flight initiation distance, trap behaviour, giving-up density)

    • Behaviour measured (e.g. avoidance, docility, exploratory behaviour, fear)

    • What equipment is required (e.g. camera traps, specialist equipment)

  • Type of predator exposure

    • Comparison between populations with varying exposure to predators (yes/no)

    • Use of predator cue (yes/no)

      • Direct or contextual

      • Olfactory, visual, or acoustic

      • Type of cue (e.g. faeces, urine, call, taxidermied model)

    • Cue properties

      • Did the cue move?

      • Size of cue (small < 5 kg, medium 5–20 kg, large > 20 kg)

      • Type of predator (e.g. terrestrial or aerial)

  • Robustness of methods

    • Sample size

      • Number of individuals

      • Number of populations (treatment groups)

      • Number of repeat measures per individual

      • Number of repeat measures per population

    • Measure of repeatability

      • Within individuals

      • Within populations

    • Was there a control treatment (exposure or cue)

    • If/how the methods were validated (e.g. fate of individuals, success criteria)

  • Effect size (difference in means between treatment and control group)

    • Mean response (and standard deviation) of treatment group

    • Sample size of treatment group

    • Mean response (and standard deviation) of control group

    • Sample size of control group

For the quantitative component, we will extract the mean response of each treatment, its corresponding measure of variability (standard deviation, standard error or variance), and the sample size for each treatment. In articles where this information is presented graphically, we will calculate the measures from the figures (with the axes as scale bars) using the software ImageJ [60]. Metadata will be scored using a customised data sheet (Additional file 3; adapted from [61]) by two members of the review team. Each member will cross-check 5% of articles (to a maximum of 20) to ensure consistency. If differences are found in the extracted information, the meta-data protocol will be refined and cross-checking will be repeated until all extracted data are consistent. Where any information is unclear or missing, authors will be contacted. After contacting authors, if the treatment/control standard deviations or sample sizes are absent, or if more than 50% of metadata are still missing, the article will be excluded from the quantitative review component. Extracted data will be made available with the full review as supplementary material.
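The extraction rules in this paragraph can be sketched in code. The example below is a minimal illustration under stated assumptions: the field names (`treatment_sd`, `control_n`, and so on) are hypothetical and do not come from the data sheet in Additional file 3, and missing values are represented as `None`. The conversions rely on the standard relationships SD = SE × √n and SD = √variance.

```python
import math

# Hypothetical field names for the values needed to compute an effect size.
EFFECT_SIZE_FIELDS = ("treatment_sd", "treatment_n", "control_sd", "control_n")


def to_standard_deviation(value, kind, n=None):
    """Convert a reported measure of variability to a standard deviation."""
    if kind == "sd":
        return value
    if kind == "se":
        if n is None:
            raise ValueError("sample size is needed to convert a standard error")
        return value * math.sqrt(n)
    if kind == "variance":
        return math.sqrt(value)
    raise ValueError(f"unknown measure of variability: {kind}")


def include_in_quantitative_component(record, metadata_fields):
    """Apply the exclusion rule after authors have been contacted."""
    # Exclude if any standard deviation or sample size is still absent.
    if any(record.get(field) is None for field in EFFECT_SIZE_FIELDS):
        return False
    # Exclude if more than 50% of the metadata fields are still missing.
    missing = sum(record.get(field) is None for field in metadata_fields)
    return missing / len(metadata_fields) <= 0.5


print(to_standard_deviation(2.0, "se", n=16))  # 8.0
```

An article missing exactly half of its metadata fields would still be retained under this reading, because the rule excludes only those with more than 50% missing.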

Potential effect modifiers/reasons for heterogeneity

The following additional factors to be investigated by the review were compiled using the expertise of the review team, incorporating suggestions from stakeholders. We may unintentionally exclude some useful data by only searching articles written in the English language. There may be a bias in the types of animals for which measures have been developed, for example, threatened or charismatic species. The type of predator cue used may substantially affect the outcome, as less effective cues may not be representative of an individual’s response to a true predation event [62,63,64,65]. For the most robust quantification of behaviour, methodology should use repeat measures, incorporate measures of repeatability, and validate the assays, for example, by quantifying the fitness outcomes of various behavioural responses [66, 67]. With such a systematic review, we hope to highlight where biases may be occurring, and reveal areas where more robust methodology is needed to guide the development of behavioural assays.

Data synthesis and presentation

The results from this systematic review will be presented both in a narrative synthesis (to address the primary question) and with a quantitative analysis (to address the secondary question) [51]. To answer the first question, what behavioural assays have been used to quantify anti-predator responses in mammals, each article and the associated meta-data will be detailed in a table of findings that will divide studies up based on the different assay-types. Specific examples of different methods will be discussed in further detail within the text of the review. Some descriptive statistics based on the meta-data will be used to reveal patterns such as species tested. We will discuss techniques that are used regularly and aspects of existing methodology that have been well developed and tested. For example, we will quantify the number of replicates per study, reveal the proportion of studies that incorporated measures of repeatability, and assess how existing methods have been validated (and describe the mechanisms used). We will also discuss features that are lacking from existing methodology, or characteristics that are poorly represented (e.g. specific taxonomic groups). There will be a section that features suggestions for future development of behavioural assays.

The secondary question, which assay-types and predator cues elicit the greatest behavioural response, will be answered based on the meta-data extracted surrounding the experimental design of each study. Using the treatment means, standard deviations and sample size extracted from each study, we will calculate a standardized measure of effect size for differences between means using Hedges’ g [58]:

$$g=\frac{{\mu }_{t}-{\mu }_{c}}{{s}_{p}}$$

where \({\mu }_{t}\) is the mean of the treatment group, \({\mu }_{c}\) is the mean of the control group and \({s}_{p}\) is the pooled standard deviation. The formula for pooled standard deviation is:

$${s}_{p}=\sqrt{\frac{\left({n}_{t}-1\right){s}_{t}^{2}+\left({n}_{c}-1\right){s}_{c}^{2}}{\left({n}_{t}-1\right)+ \left({n}_{c}-1\right)}}$$

where \({n}_{t}\) and \({s}_{t}\) are the number of observations and standard deviation for the treatment group respectively, and \({n}_{c}\) and \({s}_{c}\) are the number of observations and standard deviation for the control group respectively. Hedges’ g was chosen over other effect size measures such as Cohen’s d, as it is suited to a range of sample sizes and because it facilitates comparisons across studies by weighting each measure based on the number of observations [68]. We will build two mixed effects models using R [53] to identify which predator cue types and behavioural assay types elicit the greatest difference in effect size (Hedges’ g), while controlling for potential confounding factors where possible. We will include each study’s unique identifier as a random effect in both models to account for the non-independence of multiple effect sizes from each study. The protocol for this review adheres to the ROSES guidelines (see Additional file 4 for checklist).
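The effect-size calculation defined above can be sketched as follows. The first two functions implement the formulas exactly as written in this protocol. The third is an assumption that is not part of the protocol: many references define Hedges' g with an additional small-sample correction factor, approximately J = 1 − 3/(4·df − 1) with df = n_t + n_c − 2, and it is shown here only so that readers can compare the two versions. All function names and the example values are hypothetical.

```python
import math


def pooled_sd(n_t, s_t, n_c, s_c):
    """Pooled standard deviation of the treatment and control groups."""
    if n_t < 2 or n_c < 2:
        raise ValueError("each group needs at least two observations")
    numerator = (n_t - 1) * s_t**2 + (n_c - 1) * s_c**2
    return math.sqrt(numerator / ((n_t - 1) + (n_c - 1)))


def standardised_mean_difference(mean_t, s_t, n_t, mean_c, s_c, n_c):
    """g as defined in the protocol: difference in means over the pooled SD."""
    return (mean_t - mean_c) / pooled_sd(n_t, s_t, n_c, s_c)


def small_sample_correction(n_t, n_c):
    """Approximate correction factor J (an assumption, not in the protocol)."""
    df = n_t + n_c - 2
    return 1 - 3 / (4 * df - 1)


# Made-up values: treatment mean 12 (SD 4, n 10), control mean 8 (SD 4, n 10).
g = standardised_mean_difference(12.0, 4.0, 10, 8.0, 4.0, 10)
print(g)                                    # 1.0
print(g * small_sample_correction(10, 10))  # slightly smaller than 1.0
```

With equal standard deviations of 4 in both groups, the pooled standard deviation is also 4, so a difference in means of 4 gives g = 1. The correction matters most when groups are small and approaches 1 as sample sizes grow.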