Introduction

Conservation translocations are increasingly used to restore or reinforce decreasing or locally extirpated populations (Seddon et al. 2007). However, such active interventions can entail risks, including potentially unwanted impacts on the recipient ecosystem (IUCN/SSC 2013). This concern is especially strong for conservation introductions (also known as assisted colonisation), the movement of species beyond their indigenous range, which are often proposed as a solution when ecosystems have been altered to the point where they are no longer habitable for their native species (Hoegh-Guldberg et al. 2008; Seddon 2010). In the face of such risks, translocation decisions require a balance of interests where costs and benefits must be carefully weighed (Canessa et al. 2016a, b, c). The stakes can be exceptionally high, with possible extinction on the one hand and damaging biological invasion risks on the other.

Careful risk assessments are a fundamental requirement of any such discussion (Mueller and Hellmann 2008), but they require knowledge about historical ranges and proposed release sites, including interactions among multiple species, the environment, and humans (Roy et al. 2020). The highly endangered species typically considered for conservation introduction are usually rare, and their ecology may be poorly known, even in their original range. Processes of decline, persistence at low numbers, and introduction outside historical range can, in themselves, induce changes in behaviour and ecology (Wilson et al. 2020); additional research to clarify these uncertainties is often limited by sample sizes and perceived risks (Martin et al. 2012; Tulloch et al. 2015; Canessa et al. 2016a, b, c). Where empirical evidence is limited, risk assessments often require some level of expert judgment.

Where translocation decisions involve concerns of invasion risks, assessment should use insights and available methods from invasion ecology. For example, the Environmental Impact Classification for Alien Taxa (EICAT; Blackburn et al. 2014) has been adopted by the IUCN as a global standard to classify the impacts of species established beyond their range (IUCN 2020). This is done by identifying the mechanisms through which an invasive species impacts an ecosystem and by classifying the severity of impact based on five levels ranging from “Minimal” (negligible impact on native species) to “Massive” (irreversible impact resulting in the replacement or extinction of at least one native species) (Hawkins et al. 2015; Kumschick et al. 2020; Volery et al. 2020). Although the EICAT typically focuses on realised impacts by invasive species, it provides a standard classification system that might be extended to the assessment of prospective impacts, including for conservation translocations of threatened species.

Shifting the focus from recorded to prospective impacts is likely to require some level of extrapolation. Although ideally evidence-based (Kesner and Kumschick 2018; Henry and Sorte 2022), EICAT assessments often complement empirical knowledge with expert opinion (Dehnen-Schmutz et al. 2022). A wide body of literature exists on how to carry out elicitation to minimise individual and group biases (Burgman et al. 2011; Martin et al. 2012; Sutherland and Burgman 2015). These methods have been used effectively in combination with EICAT before (Turbé et al. 2017; Vanderhoeven et al. 2017). However, assessments that use categorical ratings, like the impact levels in EICAT, remain challenging. Linguistic uncertainty, where people interpret different words in different ways (Wintle et al. 2019), is reduced by providing a standard set of definitions for all levels. However, even after definitions are clarified, experts may disagree in their judgments and differ in their levels of certainty. In these cases, seeking consensus by discussion may facilitate groupthink bias and reward overconfidence (Kuhnert et al. 2010).

Alternatively, variation across experts can be accounted for by aggregating judgments, although this is not straightforward for verbal expressions. For example, it is difficult to define an “average” rating across a group of experts where judgments range from Minimal to Massive. It might be tempting to convert categories to numerical scores to aggregate them mathematically, for example by multiplying or averaging them (Evans 2018; Sohrabi et al. 2021), but such aggregation is often meaningless or misleading (Game et al. 2013; Canessa et al. 2021). Moreover, uncertainty is not limited to disagreements among experts but also includes different levels of confidence held by individuals. This uncertainty, reflecting lack of knowledge or simply the challenge of predicting future events, must be accounted for in any analysis, both when considering individual estimates and when aggregating across groups. Previous studies have approached uncertainty in EICAT classification by including and comparing multiple experts and their confidence, but they still rely on verbal definitions and arbitrary categorisations (Probert et al. 2020; Clarke et al. 2021). There are several methods for more formal quantitative expert aggregation, for example using multinomial-Dirichlet conjugate priors to express probabilities of categorical events (Wilson et al. 2021), that would be useful in the assessment of invasion risks, including those from conservation translocations.

In this study, we combined EICAT impact definitions with formal expert elicitation and quantitative aggregation of expert judgments to carry out an assessment of translocation risks, maximising transparency and incorporating uncertainty within and across multiple experts. We used this process in the practical case study of an extinct-in-the-wild bird species being considered for a conservation introduction beyond its known historical range.

Methods

Risk context

The sihek (Guam kingfisher, Todiramphus cinnamominus) is a medium-sized endangered bird species endemic to the island of Guam in the North-Western Pacific (Jenkins 1983). Historically, they inhabited limestone and ravine forests, coconut groves, and strand vegetation and primarily fed on lizards and large insects, but were also observed to occasionally predate crabs, segmented worms, fish, and small birds (Beck et al. 1990 and references therein). Sihek are territorial; during the nesting season (December to July), they excavate nest cavities in softwood and lay one to three eggs (Beck et al. 1990 and references therein).

Sihek were extirpated from Guam following the accidental introduction of the invasive brown tree snake (Boiga irregularis), likely through military activities (Savidge 1986; Engbring and Fritts 1988). Prior to extirpation, a conservation breeding population was established, which persists to date but is descended from only 16 individuals. The species is listed as Extinct in the Wild by the IUCN (BirdLife International 2017). Returning sihek to the wild has been recommended in part due to increasing genetic concerns arising from continued small population size and subsequent inbreeding (Trask et al. 2021), as well as potential adaptation to captive conditions. However, the continued presence of brown tree snakes in Guam remains a major challenge to reintroducing sihek in the immediate future. The establishment of a wild population outside the indigenous range could allow sihek to breed in a more natural environment. This, in turn, would reduce potential for adaptation to captive conditions and increase the global population size, thereby slowing the rate of increase in inbreeding and loss of genetic diversity (Trask et al. 2021), and providing valuable information about their behaviour and vital rates in the wild to inform future reintroduction efforts on Guam. Prior to any such action, the risks posed by sihek to recipient ecosystems should be thoroughly assessed; however, available information about sihek natural history is mainly limited to anecdotal descriptions and expert knowledge.

In 2019, we began evaluating potential release sites for sihek based on ecological suitability and logistical feasibility. From a prior longlist of sites potentially suitable for sihek throughout the Pacific Islands based on ecological criteria (Laws and Kesler 2012), a sihek working group identified five candidate release sites where sihek release would in principle be logistically feasible: Kosrae, Chuuk, and Yap in the Federated States of Micronesia; Palmyra Atoll in the Northern Line Islands; and Tinian as a representative island of the Commonwealth of the Northern Mariana Islands (Fig. 1) (Trask et al. 2019). These sites were selected for initial assessment to inform further engagement with local authorities and communities. To assess the risks posed by sihek introduction to these sites, we combined the EICAT framework with formal expert elicitation as described below.

Fig. 1 Location of Guam and the five candidate introduction sites selected for this impact assessment study: Kosrae, Chuuk, and Yap in the Federated States of Micronesia, Palmyra Atoll in the Northern Line Islands, and Tinian in the Commonwealth of the Northern Mariana Islands (CNMI)

EICAT-based risk assessment

To assess the risk of environmental impacts posed by sihek introduction to the candidate sites, we based our approach on the Environmental Impact Classification for Alien Taxa (EICAT; Blackburn et al. 2014), a well-developed framework widely used to classify the impacts of alien species based on twelve ecological impact mechanisms recognised by the IUCN SSC Invasive Species Specialist Group (IUCN 2020). The EICAT requires assessors to identify the relevant impact mechanisms and assign to each one of five impact levels: Minimal, Minor, Moderate, Major, and Massive (IUCN 2020; Table 1). Based on ecological knowledge of sihek and similar species, we were able to confidently exclude several impact mechanisms (e.g., chemical or physical alteration of habitat). Out of the twelve impact mechanisms described in the EICAT framework, we therefore concentrated this assessment only on competition, predation, disease, and hybridisation. Furthermore, because sihek are endemic to Guam and their extirpation is recent, their return would not represent a conservation introduction; therefore, we did not include Guam in the assessment for competition, predation, and hybridisation impacts. We did, however, include Guam in the disease assessment, because releasing birds brought from institutions in North America might accidentally introduce pathogens novel to Guam.

Table 1 Top section: sihek-specific Description and Outcome for each impact mechanism as provided to experts. Bottom section: impact levels as defined in the EICAT (IUCN 2020), used by experts to quantify outcomes for each mechanism

Expert elicitation

To formally elicit expert judgment within the EICAT framework, we based our assessment on the IDEA protocol (“Investigate,” “Discuss,” “Estimate”, and “Aggregate”) for elicitation (Hemming et al. 2018). We selected a total of 40 experts based on their knowledge about at least one of the following areas: the focal species, invasive species, one or more impact mechanisms, and source and release sites, and invited them by individual email to complete an assessment for the specific impact mechanism(s) relevant to their expertise. In total, 21 experts participated and completed 25 assessments (5–8 experts per mechanism; Table 2). Some experts contributed to evaluating more than one mechanism, and two collaborated on a joint assessment of the mechanisms predation and competition. The final expert cohort of respondents was composed of fifteen men and five women from Italy, New Zealand, the United Kingdom, and the USA, aged 20 to 60+, with a majority between 40 and 60 years old. Most assessors were research biologists, except for disease impacts, where most contributors were wildlife veterinarians.

Table 2 Summary of the information provided to experts for their assessment. Top: Summary information about sihek or sihek-related species (drawn from the literature on sihek, related species, and the release sites). Bottom: Summary list of species potentially relevant to the assessment of each impact mechanism. Numbers in parentheses indicate the number of assessments and of contributing experts

Experts contributing to assessments of competition and predation impacts had expertise in Pacific birds, as well as potential competitor or prey taxa, respectively. Contributors assessing hybridisation impacts had expertise in conservation genetics, hybridisation, and speciation. Invasive species experts also contributed to the assessment of these mechanisms. For the assessment of disease impacts, contributors had expertise in wildlife health, with most experts also familiar with the IUCN wildlife disease risk analysis process. Experts were also provided with scientific literature relevant to their assessment. This comprised a list of species that could potentially interact with sihek at each site through the mechanism assessed, as well as background information on sihek or related species and the potential release sites (summarised in Table 2), and a description of the impact mechanisms and levels (summarised in Table 1). The material provided to experts in its entirety (full list of species and complete background information) is available in the Supplementary Material (Appendix 1).

To avoid the linguistic uncertainty associated with verbal definitions of risk, avoid arbitrary categorisation, and facilitate comparison and aggregation, we asked each expert to estimate the probability that a given impact would reach each of the five levels at a given release site (summing to 1 for each site). Experts were asked to provide a written justification for their impact estimates (Supplementary Material, Appendix 3). We also asked experts to estimate confidence in their knowledge, ranging from 0% (no confidence) to 100% (total confidence). We repeated the elicitation for all impact mechanisms across all sites. All elicited estimates, in anonymous format, can be found in Supplementary Material.
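The elicitation format described above can be sketched as one record per expert, site and mechanism. This is a minimal illustration only: the class name, field names, tolerance and example values are assumptions made here, not the authors' data or code.

```python
from dataclasses import dataclass

# The five EICAT impact levels, in increasing order of severity
LEVELS = ("Minimal", "Minor", "Moderate", "Major", "Massive")


@dataclass
class Judgment:
    expert: str        # anonymous expert label (hypothetical)
    site: str
    mechanism: str
    probs: tuple       # one probability per level, in LEVELS order
    confidence: float  # self-assessed confidence, 0 to 100 (%)

    def validate(self, tol=1e-6):
        """Check the constraints stated in the text: five probabilities
        that sum to 1, and a confidence between 0 and 100."""
        if len(self.probs) != len(LEVELS):
            raise ValueError("need one probability per impact level")
        if any(p < 0 or p > 1 for p in self.probs):
            raise ValueError("probabilities must lie in [0, 1]")
        if abs(sum(self.probs) - 1.0) > tol:
            raise ValueError("probabilities must sum to 1 for each site")
        if not 0 <= self.confidence <= 100:
            raise ValueError("confidence must be between 0 and 100")
        return True


# Hypothetical response, for illustration only
example = Judgment("expert_A", "Palmyra Atoll", "competition",
                   (0.80, 0.15, 0.05, 0.00, 0.00), 70)
example.validate()
```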

Aggregation of estimates and uncertainty

After obtaining quantitative estimates from each expert about the probability of different impact levels for each mechanism, we sought to summarise that information quantitatively, incorporating both the dispersion across the group and the confidence of individual experts. We developed, used and compared three methods for aggregation that provided different but complementary ways to account for this uncertainty.

First, we aggregated the expert estimates without considering their expressed confidence (hereafter, we refer to this as “basic method”). For each expert, we defined a Dirichlet distribution, using the expressed probabilities of different impact levels as shape parameters (intuitively corresponding to “votes” for a given impact level: each expert allocated their votes according to their beliefs), and sampled that distribution 100 times. We then aggregated those votes into a single distribution (with a length equal to 100 times the number of experts for the given impact mechanism) and assessed it using summary statistics (mean, median and standard deviation) and plots.
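The basic method described above can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors' R code: it uses the Python standard library, draws Dirichlet samples by normalising gamma variates, treats the elicited probabilities themselves as the shape parameters (one reading of the text), and fixes any level given zero probability at 0. The expert values are hypothetical.

```python
import random
import statistics

LEVELS = ("Minimal", "Minor", "Moderate", "Major", "Massive")


def dirichlet_draw(shapes, rng):
    """One Dirichlet draw via normalised gamma variates.
    Levels with a zero shape parameter stay at 0 (an assumption)."""
    while True:
        gammas = [rng.gammavariate(a, 1.0) if a > 0 else 0.0 for a in shapes]
        total = sum(gammas)
        if total > 0:  # guard against numerical underflow
            return [g / total for g in gammas]


def basic_aggregate(expert_probs, draws_per_expert=100, seed=1):
    """Pool an equal number of draws from every expert, ignoring confidence."""
    rng = random.Random(seed)
    pooled = [dirichlet_draw(probs, rng)
              for probs in expert_probs
              for _ in range(draws_per_expert)]
    summary = {}
    for k, level in enumerate(LEVELS):
        values = [draw[k] for draw in pooled]
        summary[level] = (statistics.mean(values),
                          statistics.median(values),
                          statistics.stdev(values))
    return pooled, summary


# Hypothetical judgments from three experts for one site and mechanism
experts = [
    (0.60, 0.25, 0.10, 0.05, 0.00),
    (0.30, 0.40, 0.20, 0.05, 0.05),
    (0.80, 0.15, 0.05, 0.00, 0.00),
]
pooled, summary = basic_aggregate(experts)
for level, (mean, median, sd) in summary.items():
    print(f"{level:9s} mean={mean:.2f} median={median:.2f} sd={sd:.2f}")
```

The pooled list has 100 draws per expert, matching the length described in the text (100 times the number of experts).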

Second, we repeated the same process but accounted for uncertainty by weighting experts proportionally to their expressed confidence (hereafter, “bootstrapping method”). For each expert, the shape parameters of the Dirichlet distribution were proportional to that expert’s expressed confidence:

$$v_{e} = \frac{c_{e}\, E\, V}{\sum_{i = 1}^{E} c_{i}}$$

where c_e is the confidence expressed by expert e, E is the total number of experts, and V is the number of votes per expert in the basic method. For example, for an assessment with four experts who had expressed confidences c of 100, 80, 60 and 40, their proportional shares of 400 votes (V = 100 each in the basic method, times four experts) would be 0.36, 0.29, 0.21 and 0.14, corresponding to v_e of 143, 114, 86 and 57 votes respectively. Again, we aggregated the draws into a single distribution and assessed it using summary statistics and plots.
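The allocation of votes by confidence can be checked numerically with a short sketch. The function name is hypothetical; the inputs are the four example confidences given in the text.

```python
def weighted_votes(confidences, votes_per_expert=100):
    """Share E * V votes among experts in proportion to their confidence."""
    n_experts = len(confidences)
    total_confidence = sum(confidences)
    return [c * n_experts * votes_per_expert / total_confidence
            for c in confidences]


confidences = [100, 80, 60, 40]
votes = weighted_votes(confidences)
shares = [round(c / sum(confidences), 2) for c in confidences]
rounded = [round(v) for v in votes]

print(shares)   # [0.36, 0.29, 0.21, 0.14]
print(rounded)  # [143, 114, 86, 57]
```

The total number of votes stays at 400, the same as in the basic method, so only the relative weight of each expert changes.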

Third, we used a Bayesian approach to estimate a posterior distribution for the mean probabilities across the expert group, accounting for uncertainty. For each expert, we created deterministically a vector of votes of length equal to their expressed confidence, subdivided proportionally to the expressed probabilities for each impact level. We then bundled all expert vectors together and used a Bayesian sampler to estimate the underlying probabilities, using an uninformative Dirichlet prior (conjugate to the categorical distribution and a common choice for such processes). We drew N samples from the posterior distribution, where N was calculated to provide the same total number of draws as obtained for the basic and bootstrap aggregation after a burn-in of 2000 (N = V E + burn-in), over three Markov chains with a thinning rate of 3. We assessed chain convergence by visual inspection and the R-hat statistic (Brooks and Gelman 1998).
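Because the Dirichlet prior is conjugate to the categorical likelihood, the idea behind the third method can be illustrated without an MCMC sampler. The sketch below is a substitute for, not a reproduction of, the JAGS model used in the study: it samples directly from the closed-form posterior, assumes a flat Dirichlet(1, ..., 1) prior as the "uninformative" choice, allows fractional vote counts, and uses hypothetical judgments.

```python
import random


def pooled_counts(judgments):
    """judgments: list of (probs, confidence) pairs.
    Each expert contributes `confidence` votes, split by their probabilities."""
    n_levels = len(judgments[0][0])
    counts = [0.0] * n_levels
    for probs, confidence in judgments:
        for k, p in enumerate(probs):
            counts[k] += confidence * p
    return counts


def posterior_samples(counts, n_samples=1000, prior=1.0, seed=1):
    """Draw from the conjugate posterior Dirichlet(prior + counts)."""
    rng = random.Random(seed)
    shapes = [prior + c for c in counts]
    samples = []
    for _ in range(n_samples):
        gammas = [rng.gammavariate(a, 1.0) for a in shapes]
        total = sum(gammas)
        samples.append([g / total for g in gammas])
    return samples


# Hypothetical judgments: (probabilities per level, confidence in %)
judgments = [
    ((0.60, 0.25, 0.10, 0.05, 0.00), 80),
    ((0.30, 0.40, 0.20, 0.05, 0.05), 60),
]
counts = pooled_counts(judgments)
samples = posterior_samples(counts)
post_mean = [sum(s[k] for s in samples) / len(samples) for k in range(5)]
print([round(m, 2) for m in post_mean])
```

In this formulation the posterior narrows as the total number of pooled votes grows, which is one way to understand why a "consensus" mean can show much less spread than the pooled draws of the other two methods.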

In summary, the basic method represents uncertainty across the group but ignores uncertainty in individual judgments; the bootstrapping method includes both group and individual uncertainty by weighting; and the Bayesian method uses the weighted estimates to derive a hypothetical underlying posterior distribution (“consensus” group mean) for each impact probability. We consider none of the methods to be inherently superior to the others, but to represent different approaches to treating uncertainty (ignoring it, fully accepting it, and seeking consensus) that should be evaluated jointly.

All analyses described above were conducted using R (R Core Team 2021) and JAGS (Plummer 2003), and R packages readxl (Wickham et al. 2023), tidyverse (Wickham et al. 2019), rBeta2009 (Cheng et al. 2014), jagsUI (Plummer 2003), ggplot2 (Wickham 2011), and gridExtra (Auguie and Antonov 2017). Reproducible code can be accessed from GitHub (https://github.com/maude-v/invasion_risk_EICAT).

Results

As expected, the mean estimates from the bootstrapping and Bayesian methods were equal to two decimal places. Unless otherwise stated, we report the latter here for consistency. Detailed estimates for all three methods (basic, bootstrapping and Bayesian) can be found in Supplementary Material, Appendix 2, Tables S1 and S2.

For competition, there was general agreement across most experts that Major or Massive impacts were unlikely. The mean estimates for these two levels ranged from 0 to 0.2 in all sites, with the exception of Tinian, where experts did not rule out a higher probability of Major or Massive impacts, albeit with some disagreements (Fig. 2). Palmyra Atoll was considered the lowest-risk site for this impact mechanism, with Minimal impacts receiving a mean probability greater than 0.8 regardless of the aggregation method used (Fig. 2; Tables S1 and S2). For Chuuk, Kosrae and Yap, estimates were broadly distributed across Minimal to Minor impacts, with greater uncertainty (wider confidence intervals), reflecting divergences of opinion among experts (Fig. 2; Table S2). The presence of the Mariana kingfisher Todiramphus albicilla, congeneric to sihek, and several other bird species also led some experts to increase their estimates for competition impacts on Tinian. Uncertainty around this mechanism was also high, because candidate sites host many insufficiently surveyed diurnal skink and gecko species with poorly known systematics and unknown densities.

Fig. 2 Aggregation of expert estimates for competition impacts for all sites. Panel (Basic) represents the basic method (no consideration of expert confidence), panel (Bootstrapping) the bootstrapping method (experts weighted by confidence), panel (Bayesian) the Bayesian method (posterior distribution of group estimates). In each panel, black dots indicate means, black lines indicate the 25th and 75th percentiles, violin plots indicate the probability distributions by impact level, increasing in severity from left to right

Overall, predation was considered the most severe potential mechanism of impact: for all sites except Palmyra Atoll, the probability of Massive impacts (non-reversible loss of at least one species, Table 1) was higher than 0.15. Sihek have a relatively wide range of possible prey (Beck et al. 1990). Probabilities were estimated more evenly across levels and sites than for other impact mechanisms (mean ~ 0.2, Table S1). Estimates of Minor and Moderate impacts showed a relatively clear bimodal split in the group, with different experts placing the probability of these impacts at around 0.15 or around 0.5 (Fig. 3). Across the group, Moderate impacts were considered the most probable for all sites; the probability of Massive impacts was lowest at Palmyra Atoll (0.03; Table S1), both in terms of the mean estimate across the group and of the estimate by the most pessimistic expert. All other islands were rated similarly, with Yap having the highest probability of Massive impacts (0.22; Table S1). Experts expected released sihek to feed opportunistically on many taxa, possibly reducing the risk of substantial impacts on any one species. However, some experts reasoned that the sihek’s hunting habits and known historical prey preferences could indicate a particular risk to small passerines and diurnal lizard species. In particular, Yap and Kosrae host endemic passerine species (the Kosrae white-eye Zosterops cinereus, the Yap olive white-eye Z. oleagineus and the plain white-eye Z. hypolais), which experts highlighted as potential sihek prey species.

Fig. 3 Aggregation of expert estimates for predation impacts for all sites. Panel (Basic) represents the basic method (no consideration of expert confidence), panel (Bootstrapping) the bootstrapping method (experts weighted by confidence), panel (Bayesian) the Bayesian method (posterior distribution of group estimates). In each panel, black dots indicate means, black lines indicate the 25th and 75th percentiles, violin plots indicate the probability distributions by impact level, increasing in severity from left to right

For hybridisation, substantial impacts were expected only on Tinian, where Minimal and Minor impacts were considered most likely (0.41 and 0.35, respectively; Fig. 4, Table S1), but Moderate and even Major impacts could not be ruled out (0.17 and 0.05, respectively; Fig. 4). For all other sites, the most likely impacts through hybridisation were generally considered Minimal or Minor (0.93, and 0.06 to 0.07, respectively; Fig. 4, Table S1), with high confidence that impacts of greater magnitude were unlikely (0.00 mean and third quartile; Table S1). Higher estimates for Tinian reflected the experts’ concern that sihek introduction to northern Marianas islands might have deleterious effects on the endemic Mariana kingfisher. Although reproductive isolation between sihek and Mariana kingfisher was considered likely (Andersen et al. 2015), a risk of impact through hybridisation could not be ruled out with certainty.

Fig. 4 Aggregation of expert estimates for hybridisation impacts for all sites. Panel (Basic) represents the basic method (no consideration of expert confidence), panel (Bootstrapping) the bootstrapping method (experts weighted by confidence), panel (Bayesian) the Bayesian method (posterior distribution of group estimates). In each panel, black dots indicate means, black lines indicate the 25th and 75th percentiles, violin plots indicate the probability distributions by impact level, increasing in severity from left to right

Disease impacts through sihek introduction were also prominent in the judgment of experts, with probabilities evenly spread between Minimal and Major impacts and the lowest disagreement among experts of all mechanisms (Fig. 5). The probability of Minor to Moderate impacts was estimated around 0.3 at all sites (Table S1). Tinian had the highest probability of Major and Massive impacts (0.19 and 0.09, respectively; Fig. 5, Table S1), whereas Palmyra Atoll had the highest probability of Minimal impacts and the lowest probability of Major to Massive impacts (0.04; Fig. 5, Table S1). While there were slight disagreements among experts about the estimated probabilities of Minor to Moderate impacts, there was general agreement about the probability of Major impacts at all sites, estimated around or above 0.1 except for Palmyra Atoll. At Palmyra Atoll, experts judged the risks of disease impacts to be generally lower (Fig. 5), particularly because this site does not host local passerines, but they could not rule out impacts on other taxa.

Fig. 5 Aggregation of expert estimates for disease impacts for all sites. Panel (Basic) represents the basic method (no consideration of expert confidence), panel (Bootstrapping) the bootstrapping method (experts weighted by confidence), panel (Bayesian) the Bayesian method (posterior distribution of group estimates). In each panel, black dots indicate means, black lines indicate the 25th and 75th percentiles, violin plots indicate the probability distributions by impact level, increasing in severity from left to right

When we compared estimates across the three methods, we found only minor differences between the probability distributions of expected impacts under the basic and bootstrapping methods, suggesting that uncertainty expressed by experts (and conversely their self-assessed confidence) did not substantially change results. For competition, at most sites, incorporating uncertainty slightly changed individual distributions, but not the ranking of impact levels or of sites. For hybridisation, including uncertainty increased the confidence in Minimal impacts. The Bayesian approach returned the same mean estimates as the bootstrapping method (as expected), but much narrower uncertainty ranges, suggesting that deriving a posterior “consensus” mean in this case might be misleading.

Discussion

Across our expert group, a sihek introduction was considered most likely to affect candidate release sites through predation, disease, and, to a lesser extent, competition. Overall, Tinian was deemed least favourable to sihek introduction, partly due to the presence of many potential bird competitors (Table 2). Experts suggested a potential sihek introduction would pose the lowest risk at Palmyra Atoll relative to other sites. However, competition, predation and disease risks for Palmyra Atoll were not considered negligible in absolute terms. Therefore, the Team decided to undertake a quantitative assessment of competition and predation risks at Palmyra Atoll, by collecting more information about the Atoll’s ecological community and predicting impacts via ensemble ecosystem modelling (Baker 2017), and to obtain a detailed disease risk analysis based on surveillance and expert veterinarian advice (Sainsbury and Vaughan-Higgins 2012). The results of these further assessments will then inform the ultimate decision to proceed or not with releases, and the extent and type of risk mitigation actions such as pre- and post-release quarantine and monitoring.

Our approach to assessing risks of sihek translocation also provides lessons for future similar conservation decisions. Formal elicitation and treatment of expert judgments are especially important when expanding a framework like EICAT, which is intended to assess realised impacts based on evidence, to the prediction of future impacts and their likelihood. In our case, there was general agreement among experts, particularly in the overall ranking of sites, but some key differences could still be seen. For example, on Tinian, experts disagreed about the probability of Major and Massive impacts. These differences became more marked when including expert confidence using our bootstrapping approach, because confidence was generally high for this site. As another example, for competition impacts on Palmyra Atoll, weighting estimates by confidence increased the mean probability of Minimal impacts and decreased the mean probability of Moderate impacts. This suggests that experts expecting lower impacts were more confident than those expecting greater impacts.

Research on expert judgment has shown that groups generally outperform individuals, reducing error and improving decisions (Burgman et al. 2011). However, when consulting multiple experts, the challenge is what to do with disagreements and confidence. Creating consensus through discussion may be tempting, but it can be easily biased by power dynamics, groupthink, cascade effects, and simply “discussion fatigue” (Kuhnert et al. 2010). Most elicitation methods indeed do not regard consensus as an objective; rather, they seek to define uncertainty and reduce overall noise and error (Kuhnert et al. 2010). The form in which judgments are elicited then defines how they can be aggregated or compared and how confidence can be accounted for. Verbal definitions are vulnerable to linguistic uncertainty and prevent aggregation beyond simple vote counting (Game et al. 2013; Canessa et al. 2021). These limitations apply both across experts when opinions diverge and within each expert, when one feels a range of impacts could occur because they lack certain knowledge or because the ultimate outcome depends on stochastic circumstances and thus may vary. In such cases, asking an expert to indicate only one impact level ignores uncertainty altogether and causes a substantial loss of information.

Our approach circumvented these challenges by asking experts to estimate numerical probabilities directly. Experts were able not only to select all the impact levels they considered realistic but also to indicate precisely how likely they considered each level, which allowed intuitive representation and treatment of confidence both within and across experts. Numerical probabilities were superior to verbal or constructed scales in this sense because we could follow well-defined rules for aggregation, estimation, and visual representation. On their natural 0–1 scale, numerical probabilities are also easier and more transparent to compare directly among experts, to communicate to partners and stakeholders, and to update with additional information in an adaptive framework (Runge 2011). This also applies to the expression of confidence, which we quantified as the probability of being correct (0–100%), avoiding further discretization, such as “low–medium–high” confidence scales assigned to arbitrary probability levels (Probert et al. 2020; Clarke et al. 2021). Since risk is a product of outcome and probability, the estimates we obtained could be used, for example, in decision-support methods that represent uncertainty, like decision trees (Rout et al. 2013), or to analyse risk attitudes more formally, for example by using the principle of stochastic dominance (Game et al. 2013; Canessa et al. 2016a, b, c). Separating probabilities, impacts, and confidence also helps avoid hidden value judgments and improves decision-making (Game et al. 2013).
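As a hedged illustration of how such probabilities might feed a stochastic dominance comparison, the sketch below checks first-order dominance between two sites over the ordered impact levels. It is not an analysis from this study: the function name, the site labels and all probabilities are hypothetical.

```python
from itertools import accumulate


def lower_risk_dominates(p_a, p_b, tol=1e-12):
    """True if site A first-order dominates site B towards lower impacts:
    for every level, the probability of staying at or below that level is
    at least as high at A as at B, and strictly higher for some level."""
    cdf_a = list(accumulate(p_a))
    cdf_b = list(accumulate(p_b))
    never_worse = all(a >= b - tol for a, b in zip(cdf_a, cdf_b))
    somewhere_better = any(a > b + tol for a, b in zip(cdf_a, cdf_b))
    return never_worse and somewhere_better


# Hypothetical probabilities for Minimal, Minor, Moderate, Major, Massive
site_x = (0.70, 0.20, 0.07, 0.02, 0.01)
site_y = (0.40, 0.30, 0.15, 0.10, 0.05)
site_z = (0.75, 0.05, 0.05, 0.05, 0.10)

print(lower_risk_dominates(site_x, site_y))  # True
print(lower_risk_dominates(site_x, site_z))  # False: cumulative curves cross
print(lower_risk_dominates(site_z, site_x))  # False
```

When the cumulative curves cross, as for the hypothetical sites X and Z, neither site dominates and the choice depends on the decision maker's attitude to risk.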

In addition to the uncertainty represented by the probabilities of different impact levels, we incorporated expert confidence by weighting experts, whereby the most confident had the greatest weight on the aggregated estimates. More commonly, uncertainty is expressed in Dirichlet-multinomial assessments by directly eliciting quantiles from experts (Zapata-Vázquez et al. 2014). In our case, this would have increased complexity beyond what we could require of our volunteer experts for a rapid assessment. Either way, confidence is self-expressed: this approach is intuitive and routinely used in the aggregation of elicited values (Hanea et al. 2018). However, it may reward overconfidence, a common cognitive bias of experts, particularly those with higher perceived expertise status (Burgman et al. 2011). Formal expert elicitation methods are designed specifically to reduce this overconfidence through consultation of multiple experts, anonymity, and explicit elicitation of confidence and uncertainty (Speirs-Bridge et al. 2010). Further, we still recommend including all methods (ignoring confidence, sampling the weighted group distribution, and evaluating a consensus mean) for comparison, as we did in our study, or carrying out sensitivity analysis on weights to identify key shifts in impact or site rankings.

The impact levels themselves remain categorical, and thus, our approach falls short of a fully quantitative analysis, such as a complete epidemiological model for specific disease threats. However, the EICAT definitions ensure greater consistency than simply using ad hoc verbal definitions and allow a faster assessment than a quantitative one could ever be. Pre-assessment training helps further reduce inconsistencies, although it increases the time and effort required by facilitators and experts. In our case, a thorough check of all responses showed no evidence of misinterpretation or misclassification of impact mechanisms or levels. Regardless, for future assessments, especially those concerning more complex ecological communities, we will consider including formal pre-assessment training in EICAT. Our approach of using a fast, broad EICAT-based screening to guide where detailed assessments of high-risk impacts and management options are required is ideally suited for urgent and high-stakes decisions like conservation introductions of species at extreme risk of extinction.

Ultimately, risk assessments cannot make decisions about translocations, only inform them. The same expected risks and benefits could be acceptable to some, unacceptable to others; the same level of confidence may be regarded by some as sufficient, by others as unreliable (Tulloch et al. 2015). This conflict is recognized for invasive species in general (Vimercati et al. 2022), but dealing with endangered species is likely to add more complex emotional layers, potentially conducive to irrational decision making (Wintle et al. 2022). Fully understanding this decision space is difficult but will help make better decisions with more satisfying outcomes in the long term. Decision makers might adopt multi-criteria decision methods to balance different objectives (Adem Esmail and Geneletti 2018). Uncertainty and risk attitudes could be treated by adopting a general precautionary principle, by seeking to minimise worst-case outcomes, or by eliciting and analysing fully defined utility functions for all stakeholders (McCarthy 2014). These solutions will require both institutional support, such as better legislation and decision structures, and local effort for specific decisions.

Our approach flagged Palmyra Atoll as the site most suitable for sihek introduction from an environmental risk perspective, and highlighted predation and disease as the impact mechanisms to prioritize for further assessment and mitigation. More generally, the EICAT was a useful framework to assess translocation-related impacts across candidate sites, especially when combined with formal expert elicitation of quantitative probabilities. Using the pre-defined EICAT categories reduced linguistic uncertainty; the formal elicitation process allowed multiple opinions and sharing of knowledge; and the quantitative expression and aggregation helped us compare opinions and assess discrepancies and confidence levels. Risk assessments for biological invasions, including but not limited to those that might result from conservation translocations, seek to predict future events in novel systems and places, so expert judgment will always be necessary. Simple and rigorous ways of obtaining and summarising those judgments can increase reliability and improve subsequent decision-making.