Introduction

Climate change is having an unprecedented effect on marine biodiversity, with recorded shifts in species phenology (Edwards and Richardson 2004), biogeography (Perry et al. 2005), and extinction risk (Dulvy et al. 2003; Barnosky et al. 2011). Specifically, increasing ocean temperatures due to rising anthropogenic carbon dioxide (CO2) emissions are recognised as one of the biggest threats to marine ecosystems and the goods and services they provide (Hoegh-Guldberg and Bruno 2010; Doney et al. 2012; Gattuso et al. 2015). The geographic distributions of a number of marine species have been found to track temperature changes: a meta-analysis of 857 species calculated a mean distribution shift of 72 km per decade at leading range edges, which is estimated to be an order of magnitude faster than that observed for terrestrial species (Poloczanska et al. 2013).

While some impacts are already visible and come with a moderate to high degree of certainty (IPCC 2013), efforts to understand how climate change will affect ecosystems in the future require projections using existing observations and modelling efforts (Barange et al. 2016). Predicting how species and their environments will respond to future climate change is becoming increasingly necessary to strengthen ecosystem and resource management, impact assessments, policy decisions, and conservation priorities (Guisan et al. 2013). This is highlighted by the growing number of marine climate impact publications, increasing tenfold between 1990 and 2010 (Brander et al. 2013), and the evolving sophistication of research, moving from simple, short-term experiments to complex, high resolution modelling at the species, community, and the ecosystem levels (Barange et al. 2016).

Species distribution models (SDM’s), whether correlative or mechanistic, are the dominant model class for evaluating the susceptibility of species to climate change (Guisan and Zimmermann 2000; Guisan and Thuiller 2005; Elith and Leathwick 2009; Kearney and Porter 2009; Elith et al. 2010; Kearney et al. 2010), and they are becoming an increasingly common tool in marine science (Dambach and Roedder 2011; Robinson et al. 2011). Multiple SDM’s can be used together to assess changes in community structure and biodiversity patterns (Barton et al. 2016), or they can be extended with additional parameters. For example, the Dynamic Bioclimate Envelope Model (DBEM), an extension of the SDM concept, has been used to project changes in fish size distributions (Cheung et al. 2013), as well as fisheries catch potential (Cheung et al. 2010). Models with population-dynamic (Maury 2010), trophic interaction (Chaalali et al. 2016), and biogeochemical cycling (Yool et al. 2013) parameters are also used to predict complex community and ecosystem level responses to climate change.

With numerical models and predictions comes inherent uncertainty in SDM outputs, both in relation to the parameters of the biological model, and in the climate data used to determine future ocean conditions. Though the potential uncertainty arising from the former has received attention in recent years (Thuiller 2004; Diniz et al. 2009; Garcia et al. 2012; Cheung et al. 2016b; Benedetti et al. 2017), investigation of how the variation within future climate data affects species predictions has been less systematic. This may be due to the multi-faceted nature of climate projections; most studies incorporate only a subset, or a single strand, of the available projections (see section “The cascade of climate uncertainty” for discussion on climate projections and their uncertainties). For marine species, the extent to which SDM results are affected by the choice of climate data has been found to be minimal (Hare et al. 2010, 2012), considerable (Jones et al. 2013; Jones and Cheung 2015), or changing in significance over time (Buisson et al. 2010). However, a full and inclusive exploration of climate uncertainties and their impact on ecological predictions has yet to be achieved for any marine species.

From here on we define “climate uncertainty” loosely as the variability between future climate projections. Knowledge of how to include, control for, and communicate climate uncertainty in ecological analyses remains one of the most recognised and challenging issues for climate change research in the oceans (Planque et al. 2011; Hollowed et al. 2013; Jones and Cheung 2015; Cheung et al. 2016a; Frölicher et al. 2016; Payne et al. 2016; Planque 2016). Consequently, there has been a call for a standard framework for reporting climate uncertainty, in order to move towards risk assessments based on the magnitude and probability of change, ultimately facilitating discussions for mitigation and adaptation (Cheung et al. 2016a; Payne et al. 2016). However, before such an idealistic way of reporting can be realised, we must first fully understand the current trends and attitudes towards climate uncertainty when predicting marine responses to climate change. Once this has been achieved in a systematic and formal way, we can highlight the factors preventing robust predictions, as well as the tools necessary for progress.

In this review, we aim to give an overview of the main sources of climate uncertainty and identify four criteria that constitute a thorough interpretation of an ecological response to climate change in relation to these (awareness, access, incorporation, communication of climate uncertainty). We then assess the literature to investigate the extent to which the marine ecology community has addressed these four criteria in their predictions. Next, we demonstrate, using a pelagic fish species, the range of future distributions that can result from using 62 future climate simulations as input into SDM analyses. We conclude by discussing solutions that may overcome current limitations and ensure that interpretations of ecological predictions are as representative and robust as possible.

The cascade of climate uncertainty

The Coupled Model Intercomparison Project Phase 5 (CMIP5) Earth System Models (ESM’s) are the latest group of climate models used within the Intergovernmental Panel on Climate Change AR5 report (IPCC 2013). They include and interweave the complex relationships between the climate, human activities, and ecosystem health, and have projected alternative futures for the twenty-first century under scenarios of varying severity (Moss et al. 2010). There are four scenarios, known as representative concentration pathways (RCP’s): 2.6, 4.5, 6, and 8.5 W/m². These values refer to the radiative forcing projected for the year 2100 given alternative greenhouse gas concentration trajectories (Moss et al. 2010). Within each ESM are multiple realisations, which are generated by running the model with different, but equally realistic, initial conditions.

Crucially, climate models are not exact predictions for the coming decades, but rather represent an envelope that future climate could conceivably occupy (Porfirio et al. 2014). Within this envelope of future climate is a cascade of uncertainty, from the severity of RCP, to the parameters of the ESM, to the realisation number that sets the initial state of the model (Fig. 1). These three levels of climate uncertainty (scenario uncertainty, model uncertainty, and internal variability) are commonly used in the literature (Hawkins and Sutton 2009) and have been discussed elsewhere (Cheung et al. 2016a; Frölicher et al. 2016; Payne et al. 2016). Thus, here we will give a brief overview of each and summarise their relative importance at different spatial and temporal scales.

Fig. 1

Adapted from (Wilby and Dessai 2010)

A simplified diagram of the CMIP5 structure. The global mean sea surface temperature (SST) anomalies of each simulation are shown relative to the baseline period (1982–2001). The three levels of the pyramid highlight the ‘cascade of uncertainty’ due to the different Representative Concentration Pathways (RCP), Earth System Models (ESM), and realisations (shown here as the mean of the realisations included). Coloured lines denote the position of the most commonly used ESM’s in the marine literature; grey lines indicate the other 11 ESM’s used in this study. The intersection on the top row of each time period is the multi-scenario, multi-model, multi-realisation mean

Scenario uncertainty

Scenario uncertainty stems from the different trajectories of future greenhouse gas emissions. Whilst other sources of uncertainty can potentially be reduced through progress in climate science, there is considerable intrinsic uncertainty in how society will alter emissions as it depends on socio-economic policies, international agreements, and technological advances (Moss et al. 2010). By 2100, scenario uncertainty dominates the variability in projections of ocean stressors, particularly for global surface pH and for sea surface temperature (SST) at low and mid latitudes (Frölicher et al. 2016).

Model uncertainty

Model uncertainty stems from how each model has been built and parameterized. Under the same radiative forcing, models can project quite different changes in climate. For this reason, model uncertainty for SST varies greatly between regions (~ 2000 km), and is greatest in polar regions due to the particular importance of climate feedbacks at these latitudes. As such, model uncertainty of SST remains of greater importance than scenario uncertainty at high latitudes until the end of the century (Hawkins and Sutton 2009). On a similar timescale, model uncertainty can dominate the variability in the projections of other ocean stressors, including primary productivity at low to mid latitudes, and subsurface oxygen at high latitudes and in low oxygenated waters (Frölicher et al. 2016).

Internal variability

Internal variability comes from the natural variability inherent in the complex climate system. Initialisation of the model at different starting states (i.e. the realisations of a model) can propagate variability throughout the model. This can be amplified by its internal variability, which includes chaotic behaviours, nonlinearities, and feedbacks (Payne et al. 2016). Internal variability dominates the variability of projections at shorter timescales for pH, SST, and subsurface oxygen; however, it remains an important source of uncertainty for primary productivity towards the end of the twenty-first century (Frölicher et al. 2016). Internal variability also varies regionally; for example, for pH it has a relatively greater impact in the Pacific Ocean basin and in the Southern Ocean (Frölicher et al. 2016).
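The relative weight of the three levels can be illustrated with a simple variance partition. The following is a much-simplified sketch and not the procedure of Hawkins and Sutton (2009) or Frölicher et al. (2016): it assumes a balanced set of anomalies for one location and time period, arranged as scenario × model × realisation, and the function name and example values are hypothetical.

```python
import numpy as np

def partition_variance(anom):
    """Split the spread of projections into three fractions.

    anom: array of shape (n_scenarios, n_models, n_realisations) holding,
    for example, SST anomalies at one location and time period.
    """
    anom = np.asarray(anom, dtype=float)
    # Internal variability: spread among realisations of one model and scenario.
    internal = anom.var(axis=2).mean()
    # Model uncertainty: spread among model means within each scenario.
    model_means = anom.mean(axis=2)
    model = model_means.var(axis=1).mean()
    # Scenario uncertainty: spread among scenario means.
    scenario = model_means.mean(axis=1).var()
    total = internal + model + scenario
    return {
        "scenario": scenario / total,
        "model": model / total,
        "internal": internal / total,
    }

# Hypothetical anomalies (deg C): 2 scenarios x 2 models x 2 realisations.
example = np.array([[[1.0, 1.2], [2.0, 2.2]],
                    [[3.0, 3.2], [4.0, 4.2]]])
fractions = partition_variance(example)
print(fractions)
```

In this invented example the scenario term is the largest of the three; with real projections the ranking depends on the variable, region, and lead time, as described above.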

Model bias and resolution

Additional complications that come with climate data include model bias (the initial over- or under-estimate of present day climate variables included in the ESM), and the available resolution of ESM’s, which is often too coarse to support ecologically relevant conclusions. Both of these have the potential to artificially increase or decrease a species’ predicted response to change, rendering predictions at best highly variable, and at worst, inaccurate. These uncertainties are not the focus of this review, but see Harris et al. (2014), Ruffault et al. (2014), and Tabor and Williams (2010).

Identifying a thorough interpretation of an ecological response to climate change

There are now several reviews aimed at ecological modellers (Beaumont et al. 2007, 2008; Tabor and Williams 2010; Fordham et al. 2011; Stock et al. 2011; Gould et al. 2014; Harris et al. 2014) which provide an overview of the climate model structure, as well as outlining recommendations for their use in ecological applications (see Online Resource 1 for a summary of recommendations). The principal suggestions of these reviews are that ecologists should: (i) prepare climate model projections through bias correction and appropriate downscaling techniques, (ii) strive to achieve a multi-model, multi-RCP approach to capture the variation of, and inherent uncertainty of, climate projections, and (iii) properly communicate results so that the full range of possible outcomes is retained and passed down to end users and decision makers.

Despite these guidelines, methods to source and prepare climate data are often overlooked in the literature, and a multi-model, multi-RCP approach in applied ecological or conservation studies is rare (Porfirio et al. 2014; Goberville et al. 2015). In marine science specifically, Payne et al. (2016) suggested there is a lack of any formal treatment of climate uncertainty, irrespective of ecological sub-discipline.

To further investigate the use and reporting of climate uncertainty by marine ecologists when making ecological predictions, we grouped concepts and suggestions from the literature into four assessment criteria. First, we assess whether there is a general awareness of the uncertainties arising from climate data. An awareness of the potential biases and limitations of the selected climate data is important because it demonstrates knowledge of these uncertainties and signals caution to readers perhaps less familiar with the topic. The second assessment criterion is clear and knowledgeable access to climate data. This criterion creates a link between awareness and incorporation. Reporting information on source datasets, processing procedures, and formatting methods also encourages sharing of information and facilitates progress. Third, we assess the extent of incorporation of climate uncertainty into ecological predictions. Using multiple realisations, climate models, and scenarios controls for sources of climate uncertainty and improves the interpretation of an ecological prediction. Finally, informative communication of all possible outcomes arising from a multi-model ensemble approach is the fourth criterion. This is important, particularly for conservation and marine management, so that a range of results is made transparent to allow for informed decision making. We use these criteria to assess the robustness of ecological predictions in the marine ecology literature.

Literature review

To investigate the use and reporting of climate uncertainty in marine prediction studies, we conducted a literature search within the ISI Web of Science database, using the following criteria: (“ocean” OR “marine”) AND (“climate change”) AND (“future” OR “impacts” OR “projection” OR “prediction”) AND (“species distribution” OR “bioclimatic”). This was conducted on the 2nd of February 2017 and revealed 511 journal articles published in the last 5 years (2013–2017). From these, we only included articles that specifically used the CMIP5 simulations to predict a species, community, or ecosystem level response to changes in the marine environment. We included papers that predicted distribution shifts, changes to ecosystem function, or changes to phenology, growth, or abundance. We also included those which predicted changes to fisheries productivity and those which assessed future vulnerabilities of species, communities, or ecosystems.

For each article, we explored the extent to which they incorporated the levels of climate uncertainty found within the CMIP5 structure, specifically the scenario, model, and internal variability. This was measured by the number of RCP's, ESM’s, and realisations used within the study, respectively. For those that used more than one ESM, we also noted the choice of communication method for multi-model ensemble results. We made additional notes on whether articles reported the source of present and future climate variables, and if the article discussed general limitations of the climate data used.

The final number of articles assessed in the literature review was 48 (see Online Resource 2 for full article citations). Of these, 50% focused on predicting species distributions in the twenty-first century. Predicting impacts on fisheries was also common (8%), as was the undertaking of vulnerability assessments (7%). Most studies were conducted either at a global scale (38%) or within the North Atlantic (25%).

Awareness

Overall, most articles demonstrated an awareness of climate uncertainty; 33 out of the 48 studies discussed the limitations of their results in the context of the climate data that had been selected as input. This varied, however, from briefly mentioning the need for other scenarios, ESM’s, and/or realisations to be included, to additional information justifying why specific ESM’s or scenarios were chosen to represent future conditions. In total, three studies (6%) gave a justification for why a specific ESM was chosen; either because of its resolution, specific parameterisations, or its skill (i.e. how closely it simulated observed data over a historical time period). Examples of poor awareness of the uncertainties in climate data were also found. There were cases in which basic information regarding the number of and name(s) of the ESM(s) used in the study were unreported (Joo et al. 2015; Seebens et al. 2016), and one article which also failed to report the emission scenario used (Saeedi et al. 2016). Each of these studies also failed to discuss how the climate data used to represent future conditions may have affected the ecological results they present.

Access

The most commonly reported sources of species occurrence data were the Ocean Biogeographic Information System (OBIS) and FishBase. The most frequently reported sources of SST data (which was also the most frequently used environmental variable) were the National Oceanic and Atmospheric Administration’s (NOAA) World Ocean Atlas (WOA) and Optimum Interpolated Advanced Very High Resolution Radiometer (AVHRR-OI). For the CMIP5 climate data, 68% of papers lacked clear information on the source of their data. For the remaining 32%, official CMIP5 data portals through the Earth System Grid Federation such as the PCMDI (http://pcmdi9.llnl.gov/), BADC (http://esgf-index1.ceda.ac.uk) and DKRZ (http://esgf-data.dkrz.de) were cited, as well as the NOAA Climate Change Web Portal (https://www.esrl.noaa.gov/psd/ipcc/ocn/) and KNMI climate explorer tool (https://climexp.knmi.nl/start.cgi).

Incorporation

The amount of climate uncertainty incorporated into an ecological study generally decreased through the CMIP5 structure. Most studies used data simulated under multiple RCP's but not multiple realisations (Fig. 2). The most frequent number of RCP's used by a study was two, the most common being RCP 8.5 (83%) and RCP 4.5 (45%; Fig. 2). The number of ESM’s used as input into an SDM ranged from one to 35, with 43% of studies using only one ESM to simulate future ocean conditions. 91% of studies failed to report information regarding the incorporation of internal model variability into their study (Fig. 2). Specifically, only four studies out of the 48 declared the number or name of ESM realisations used (Deutsch et al. 2015; Bruge et al. 2016; Butzin and Pörtner 2016; Cheung et al. 2016c). Deutsch et al. (2015) was the only article to report using multiple realisations for each ESM used to simulate future climate in their predictions of future metabolically viable habitats and species ranges.

Fig. 2

Summary of findings from a literature review of marine ecology publications which predict ecological responses under climate change. Graphs indicate the extent to which the articles incorporated climate uncertainty into analyses by measuring: (I.) the number and severity of emission scenarios used, (II.) the number of Earth System Models (ESM’s) used, and (III.) whether information regarding the internal variability of ESM’s was reported

We found that certain ESM’s are utilised more than others in the literature, perhaps due to the larger number and variety of climate variables that are available for these models on the Earth System Grid data portals. The most frequently used ESM’s by ecological studies were those of the Geophysical Fluid Dynamics Laboratory (GFDL-ESM2M), Max Planck Institute for Meteorology (MPI-ESM-LR), Met Office Hadley Centre (HadGEM2-ES), and the Institut Pierre-Simon Laplace (IPSL-CM5A-MR), whilst those of the Commonwealth Scientific and Industrial Research Organization and Bureau of Meteorology (ACCESS1.3), Centro Euro-Mediterraneo sui Cambiamenti Climatici (CMCC_CESM), and China’s First Institute of Oceanography (FIO-ESM) have been cited only once each. In comparison to the 15 ESM’s used within this study, the popular ESM’s lie either at the extreme high or extreme low end of projections for SST at a global scale, regardless of emission scenario and future time period (Fig. 1). For example, IPSL-CM5A-MR has one of the highest SST anomaly values under both RCP 4.5 and 8.5 whilst GFDL-ESM2M has one of the lowest. Of the four most common ESM’s, MPI-ESM-LR lies below, but closest to, the multi-model mean. A similar pattern is found when comparing the Transient Climate Response (TCR) of these ESM’s. The TCR is a measure of the overall climate sensitivity of a model, and is defined briefly as the temperature change of a simulation at the time of CO2 doubling (Flato et al. 2013). GFDL-ESM2M has a TCR value of 1.3, well below the multi-model ensemble mean of 1.8, whilst HadGEM2-ES has the second highest TCR value compared to all other CMIP5 models at 2.5 (Flato et al. 2013).

Communication

Of the 50% of studies that used more than one ESM to project an ecological response into the future, 29% chose the multi-model ensemble mean or median as the only method to communicate results. A further 29% combined this metric with a measurement to show the range of results, using either the standard deviation, the between-model range, or the coefficient of variation. Six studies went one step further and provided the ensemble mean, a range, and a comparison of the SDM’s based on different climate models. It is worth noting that, of all the papers in our review that used a species distribution modelling approach, only one article successfully incorporated multiple SDM algorithms, RCP's, and ESM’s, as well as communicating the mean and range of their results, to show the probable southward range expansion of the introduced American Jackknife clam, Ensis directus (Raybaud et al. 2015).

Case study

Theoretical background

Lanternfishes (Family Myctophidae) are an abundant and species rich group of mesopelagic fishes, with over 240 species distributed globally between the surface waters and 1000 m (Catul et al. 2011). In the Southern Ocean, Electrona antarctica (Günther, 1878) is one of the most dominant pelagic fish species in terms of abundance and biomass (Greely et al. 1999) and is one of only two myctophids to exhibit a true Antarctic distribution, south of the Antarctic polar front (Duhamel et al. 2014). Strong, circumpolar frontal systems such as the polar front play an important role in delimiting different water masses as well as the spatial distribution of the Southern Ocean pelagic ichthyofauna (Collins et al. 2012; Duhamel et al. 2014). This biogeographic range coincides with the area in which model uncertainty, rather than emission scenario uncertainty, dominates the variability among climate projections of SST until the end of the twenty-first century (Frölicher et al. 2016). Thus, E. antarctica provides an opportunity to demonstrate the extent of variation that is possible to encounter when predicting species responses to climate change under multiple ESM’s, and to investigate the additional variation that can be produced when multiple realisations and emission scenarios are also incorporated into analyses, even if they are not the dominant source of climate uncertainty at the temporal and spatial scale being investigated.

Not only is E. antarctica usefully located geographically for this study, it is also of significant ecological importance. The vast abundance and biomass of this species give it a key role in ecosystem functioning, particularly as a dominant krill predator (Greely et al. 1999) and, in turn, as an important component in the diet of many charismatic Antarctic fauna including penguins (Guinet et al. 1996), flighted seabirds (Barrera-Oro 2002), and elephant seals (Cherel et al. 2008). E. antarctica is also a major component of the diurnal vertical migration (DVM) in the Southern Ocean, in which mesopelagic fauna migrate to surface waters each night and return to depths at dawn, and so it is likely to play a significant role in the export of carbon to deeper waters (Collins et al. 2012). This importance provides another reason to investigate E. antarctica’s response to ocean warming.

Methods

Occurrence records

A total of 1186 occurrence records of E. antarctica were downloaded from the Global Biodiversity Information Facility (GBIF; http://www.gbif.org/) using the software ModestR (Garcia-Rosello et al. 2013). All occurrence records were then cleaned of unreliable data, including duplicated records, records with identical latitude and longitude, and records with a latitude and longitude corresponding to a terrestrial location, leaving 950 records for analysis.
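The cleaning rules above can be sketched in a few lines. This is an illustrative sketch only, not the ModestR workflow: it assumes that “identical latitude and longitude” means a record whose latitude equals its longitude (a common data-entry flag), and `is_land` is a hypothetical land-mask lookup supplied by the user.

```python
def clean_occurrences(records, is_land):
    """Filter (lat, lon) occurrence records.

    records: iterable of (lat, lon) tuples in decimal degrees.
    is_land: hypothetical callable (lat, lon) -> bool, True for terrestrial cells.
    """
    seen = set()
    kept = []
    for lat, lon in records:
        if (lat, lon) in seen:
            continue  # duplicated record
        if lat == lon:
            continue  # assumption: equal lat and lon treated as unreliable
        if is_land(lat, lon):
            continue  # coordinates fall on land
        seen.add((lat, lon))
        kept.append((lat, lon))
    return kept

# Invented records and a toy land mask (everything south of 70 deg S is "land").
raw = [(-60.0, 10.0), (-60.0, 10.0), (-45.0, -45.0), (-75.0, 100.0)]
print(clean_occurrences(raw, lambda lat, lon: lat < -70.0))
```

A real land mask would normally come from a coastline or bathymetry layer; the lambda here only keeps the example self-contained.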

Environmental variables

Sea surface temperature (SST) and bathymetry were used as environmental variables in our species distribution modelling. By relying on SST, the modelled distributions presented here are valid for surface waters only. We acknowledge that the vertically migrating behaviour of this species means that environmental conditions at depths of up to 1000 m should ideally be accounted for in our SDM (Duffy and Chown 2017). Obtaining a three-dimensional distribution model was inhibited, however, by unreliable depth information associated with the occurrence records. Nevertheless, the majority of occurrences used were reportedly from the upper water column (0–200 m) and the well-oxygenated water in the Southern Ocean gives highly correlated environmental variables between the surface and deep layers (SST and temperature at 1000 m Pearson’s R = 0.89), reducing the chance of misrepresenting their occupied niche.

For SST, the Optimum Interpolation Sea Surface Temperature V2 dataset from the National Oceanic and Atmospheric Administration (NOAA) was downloaded from http://www.esrl.noaa.gov/psd/data/gridded/data.noaa.oisst.v2.html. This dataset includes monthly mean SST values over a global grid of 1° resolution taken from both satellite measurements and in situ recordings for the years 1981–2016 (Reynolds et al. 2002) and is also used as the observed SST baseline during processing of future climate data. Bathymetry was taken from a global dataset of 30 arc second resolution (approx. 1 km) (Becker et al. 2009) and was re-sampled to the same resolution as SST using the bilinear resample tool in ArcGIS v. 10.4 (ESRI, Redlands, California).

Future climate data

We processed SST climate projections from 15 CMIP5 ESM’s, nine of which are an ensemble of multiple (three) realisations (Table 1). Each ESM projection is represented by a 20-year period (2081–2100), under two emission scenarios. Data for the SST variable “tos” (temperature of surface) were downloaded for the 15 ESM’s under both the RCP 4.5 and RCP 8.5 emission scenarios from the World Climate Research Programme data portal: https://esgf-index1.ceda.ac.uk/search/cmip5-ceda/. Up to three realisations were downloaded for each ESM, with 31 realisations used in total. Each realisation gives monthly mean SST estimates on a global grid of coarse resolution between the years 2006 and 2100 (see Table 1 for model resolutions). SST values from the historical realisations used to guide each ESM realisation, running from 1850 to 2005, were also downloaded from the same data portal to provide model baselines.

Table 1 Description of the 15 CMIP5 Earth System Models (ESM’s) used in this analysis

Data were processed to extract monthly mean SST values from both the 1982–2001 and 2081–2100 time slices. The 1982–2001 data were taken from both climate model and observed baselines whilst the future climate data (2081–2100) were extracted from each ESM. Mean calculations were carried out using the NetCDF Record Averager ncra command of the NetCDF Operator (NCO) utility for Linux (http://nco.sourceforge.net/). Hereafter, these time slices are referred to as baseline (1982–2001) and 2090 (2081–2100).
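The time-slice averaging can be illustrated in NumPy rather than NCO. This is a minimal sketch under stated assumptions: the monthly series starts in January of a known year, each slice is reduced to a single long-term mean per grid cell, and the function name and toy array are hypothetical.

```python
import numpy as np

def time_slice_mean(sst, start_year, first_year, last_year):
    """Average monthly SST over an inclusive span of years.

    sst: array of shape (n_months, n_lat, n_lon), month 0 = January of start_year.
    Returns an (n_lat, n_lon) array of long-term means.
    """
    i0 = (first_year - start_year) * 12
    i1 = (last_year - start_year + 1) * 12
    if i0 < 0 or i1 > sst.shape[0]:
        raise ValueError("requested years fall outside the series")
    return sst[i0:i1].mean(axis=0)

# Toy series: three years (1980-1982) on a 2 x 2 grid, value = years since 1980.
toy = np.repeat(np.arange(3.0), 12)[:, None, None] * np.ones((1, 2, 2))
print(time_slice_mean(toy, 1980, 1981, 1982))
```

For the slices used here, the baseline would correspond to `first_year=1982, last_year=2001` and the future slice to `first_year=2081, last_year=2100`, each applied to the appropriate historical or RCP series.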

We implemented a bias correction procedure to minimize the difference between observed and simulated recent climates. This is a simple and quick method to deal with bias uncertainties when processing many large, global datasets and is similar to the change-factor method described by Tabor and Williams (2010). Specifically, projected SST anomalies from a model baseline (1982–2001) were added to the baseline derived from the observed SST dataset. The processing workflow proceeds as follows (Fig. A1 in Online Resource 3). For each ESM, the projected change in SST for each grid cell for a future time period (i.e. the 2090 anomaly) was calculated by creating a new raster in which the model baseline cell values were subtracted from the 2090 cell values. Each anomaly raster was then added to the observed baseline SST raster. In this way, the projected change in SST simulated by the model is retained for each grid cell, but is shifted on to the more realistic baseline, giving an adjusted projection of future SST across the globe.
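The anomaly-based correction described above amounts to two array operations per grid. The sketch below assumes that the model and observed grids have already been brought to a common resolution and extent; the function name and the example values are hypothetical.

```python
import numpy as np

def bias_correct(model_future, model_baseline, observed_baseline):
    """Shift a model's projected change onto an observed baseline.

    All inputs are arrays on the same grid (assumed regridded beforehand).
    """
    model_future = np.asarray(model_future, dtype=float)
    model_baseline = np.asarray(model_baseline, dtype=float)
    observed_baseline = np.asarray(observed_baseline, dtype=float)
    # Projected change simulated by the model (the 2090 anomaly).
    anomaly = model_future - model_baseline
    # Add the anomaly to the observed baseline to obtain the adjusted projection.
    return observed_baseline + anomaly

# Invented single-cell example: the model baseline runs 1 deg C too warm.
adjusted = bias_correct([[3.5]], [[2.0]], [[1.0]])
print(adjusted)  # [[2.5]]
```

The simulated warming (1.5 °C in this invented cell) is preserved, while the absolute value is anchored to the observed baseline instead of the model’s own.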

All environmental variables for both present and future time periods were cropped to a latitudinal extent of 30–75°S and interpolated to 0.25 × 0.25 degrees resolution (~ 9 km) using a regularized spline interpolation of vector points implemented in ArcGIS v.10.4. This method uses a mathematical function that minimizes overall surface curvature, resulting in a smooth surface appropriate for data such as temperature (Hijmans et al. 2005). This resolution is common in the marine literature (Fly et al. 2015; Alabia et al. 2016; Byrne et al. 2016) and is appropriate to capture both the large distributional range of pelagic species and the dynamic oceanography of the Southern Ocean.

Species distribution model

Occurrence records and environmental predictors were fitted to the species distribution modelling algorithm MaxEnt (Phillips and Dudik 2008; Elith et al. 2010, 2011). MaxEnt contrasts the environment at a range of random locations across the study region (“background sites”) with the environment at locations where the species is known to be present (“presence sites”). In doing so, the model predicts the relative suitability of the environment across the study region. MaxEnt was chosen for its repeatedly high performance against other SDM algorithms (Elith et al. 2006; Ortega-Huerta and Peterson 2008; Monk et al. 2010), its popularity in the literature, ease of use, and accessibility. Furthermore, MaxEnt’s capacity to use presence-only data is appropriate because of the high potential for errors under a presence–absence approach for E. antarctica, given the relatively low sampling effort in polar regions and the species’ net avoidance behaviour (Collins et al. 2008).

The SDM was run using a 10-fold cross-validation method and 30% of occurrence data were reserved for model testing. Auto features were selected with the following settings: regularisation parameter = 1, maximum number of iterations = 500, and number of background points = 8000. Only one occurrence record per grid cell was used in the model. The predictive performance of the model was evaluated using both area under the receiver operating characteristic curve (AUC) values and the model’s omission rate.
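The two evaluation metrics can be sketched independently of MaxEnt. This is a hedged illustration, not MaxEnt’s internal code: it assumes AUC is computed by comparing suitability scores at test presence sites with those at background sites, and that the omission rate is the share of test presences scoring below a chosen threshold. Function names and scores are hypothetical.

```python
import numpy as np

def auc(presence_scores, background_scores):
    """Probability that a random presence site outscores a random background site."""
    p = np.asarray(presence_scores, dtype=float)[:, None]
    b = np.asarray(background_scores, dtype=float)[None, :]
    # Ties count as half a win (rank-based AUC).
    return float(((p > b) + 0.5 * (p == b)).mean())

def omission_rate(test_presence_scores, threshold):
    """Fraction of test presences predicted as unsuitable at the threshold."""
    s = np.asarray(test_presence_scores, dtype=float)
    return float((s < threshold).mean())

# Invented scores for illustration.
print(auc([0.9, 0.8], [0.1, 0.8]))                  # 0.875
print(omission_rate([0.5, 0.3, 0.6, 0.45], 0.41))   # 0.25
```

Under cross-validation, such metrics would be computed on each held-out fold and then summarised, for instance as a mean and standard deviation.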

This present day model was then used to predict the future distribution of E. antarctica under each climate simulation for the year 2090 (31 simulations × two scenarios, equalling 62 simulations in total). Logistic outputs, which give the conditional probability of occurrence between 0 and 1 for each grid cell in the study region, were then thresholded using the Reclassify tool in ArcGIS to create a binary presence–absence map of E. antarctica’s future distribution. For all outputs the threshold used was 0.41 as informed by the “Equal test sensitivity and specificity” threshold recommended by Liu et al. (2005).
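The thresholding step can be expressed outside ArcGIS as a single comparison. A minimal sketch, assuming that cells at or above the threshold count as suitable and that masked cells (for example land) are stored as NaN; the function name and the toy grid are hypothetical.

```python
import numpy as np

THRESHOLD = 0.41  # threshold value reported in the text

def to_binary(suitability, threshold=THRESHOLD):
    """Convert a logistic suitability grid to a 1/0 presence-absence grid."""
    s = np.asarray(suitability, dtype=float)
    binary = np.where(s >= threshold, 1.0, 0.0)
    # Keep masked cells masked instead of counting them as absences.
    binary[np.isnan(s)] = np.nan
    return binary

# Invented 2 x 2 suitability grid with one masked cell.
grid = np.array([[0.20, 0.41], [0.90, np.nan]])
print(to_binary(grid))
```

Applying the same function to each of the 62 logistic outputs (31 simulations × two scenarios) gives one binary map per climate simulation.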

We then quantified the variation in the predicted suitable area amongst the 62 outputs. The predicted area of suitable habitat, taken as the area with a probability of presence above the 0.41 threshold, was calculated for each output using the R package “raster” (Hijmans 2015) and subtracted from the present day suitable area. To quantify the spatial variability in future predictions, pairwise range overlap metrics for each of the 62 future distribution maps were calculated using the range overlap function in the software ENMTools (Warren et al. 2010). To visualise how the use of different model realisations can affect SDM results, the future distribution maps created from using each realisation of an ESM were summed together, thus showing the level of agreement between them (i.e. a value of three = high agreement in the future distribution of the species; the location is predicted to be suitable when using any of the three realisations, a value of one = low agreement in the future distribution of the species; the location is predicted to be suitable by only one of the three realisations). Similarly, the future distribution maps created from using each of the 15 ESM’s were summed together to show the level of agreement between them, ranging from 15 (the location is predicted to be suitable regardless of the ESM used as input into the SDM) to one (the location is predicted to be suitable by only one ESM).
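
The three calculations described above (suitable area, pairwise range overlap, and agreement between maps) can be sketched in Python as follows. The study itself used the R package “raster” and ENMTools, so this is a stand-in rather than the original workflow. It assumes binary maps stored as lists of rows on a regular latitude-longitude grid, a spherical Earth for cell areas, and a range overlap defined as the number of shared suitable cells divided by the size of the smaller range; that definition should be checked against the ENMTools documentation. All function names and example grids are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; spherical approximation


def cell_area_km2(lat_south, lat_north, dlon_deg):
    """Area of a lat-lon cell on a sphere:
    R^2 * dlon * (sin(lat_north) - sin(lat_south))."""
    dlon = math.radians(dlon_deg)
    return EARTH_RADIUS_KM ** 2 * dlon * (
        math.sin(math.radians(lat_north)) - math.sin(math.radians(lat_south))
    )


def suitable_area_km2(binary_grid, lat_edges, dlon_deg):
    """Sum the area of cells flagged 1. Row i spans lat_edges[i] to
    lat_edges[i + 1], listed from south to north."""
    total = 0.0
    for i, row in enumerate(binary_grid):
        area = cell_area_km2(lat_edges[i], lat_edges[i + 1], dlon_deg)
        total += area * sum(1 for cell in row if cell == 1)
    return total


def percent_change(future_area, present_area):
    """Percentage gain (positive) or loss (negative) relative to the present."""
    return 100.0 * (future_area - present_area) / present_area


def range_overlap(map_a, map_b):
    """Shared suitable cells divided by the smaller range (assumed definition)."""
    flat_a = [c for row in map_a for c in row]
    flat_b = [c for row in map_b for c in row]
    shared = sum(1 for a, b in zip(flat_a, flat_b) if a == 1 and b == 1)
    smaller = min(flat_a.count(1), flat_b.count(1))
    return shared / smaller if smaller else 0.0


def agreement_map(binary_maps):
    """Cell-wise sum of 0/1 maps: how many simulations predict suitability."""
    return [[sum(cells) for cells in zip(*rows)] for rows in zip(*binary_maps)]


# Three hypothetical realisations on a 2 x 3 grid (1 = suitable).
r1 = [[1, 1, 0], [0, 1, 0]]
r2 = [[1, 0, 0], [0, 1, 1]]
r3 = [[1, 1, 0], [0, 0, 0]]

print(agreement_map([r1, r2, r3]))   # [[3, 2, 0], [0, 2, 1]]
print(range_overlap(r1, r3))         # 1.0: r3 lies entirely within r1
print(suitable_area_km2(r1, lat_edges=[-60, -59, -58], dlon_deg=1.0))
```

In the agreement map, a value of three marks cells predicted suitable by all three realisations and a value of one marks cells predicted suitable by only one, matching the interpretation given in the text.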

Results

The present day distribution model gave strong model performance based on AUC and omission rate metrics (mean AUC = 0.829 ± 0.009 SD, omission rate = 0.001 ± 0.003 SD). Under this model, E. antarctica has an upper latitudinal distribution of ~ 55°S in the Pacific region of the Southern Ocean, to ~ 45°S in the Atlantic region, coinciding closely with the polar front (Fig. 3) and has a predicted current suitable habitat of 17,503,869 km² based on the 0.41 threshold criterion. Future SDM results based on 62 climate simulations all indicate a change in the future suitable habitat for E. antarctica, but the direction and severity of this change are highly dependent on the choice of emission scenario, ESM, and ESM realisation (Figs. 4, A2 and A3 in Online Resource 4).

Fig. 3

The present-day distribution of Electrona antarctica between 35 and 75°S as predicted by the species distribution model algorithm MaxEnt. Output is the logistic conditional probability of presence ranging from 1 (high probability of occurrence) to 0 (low probability of occurrence). The positions of the main oceanographic fronts in the Southern Ocean are shown: Subtropical Front (dashed black line), Subantarctic Front (black line), Polar Front (red line), Southern Antarctic Circumpolar Current Front (black dotted line)

Fig. 4

The percentage loss or gain of suitable habitat area for Electrona antarctica by 2090 (2081–2100) relative to 1992 (1982–2001) as predicted by species distribution models using 31 different climate simulations from 15 global climate models and two emission scenarios, RCP 4.5 (a) and RCP 8.5 (b). Simulations are grouped by climate model and by the severity of the predicted change in suitable area. Realisations of the same model are denoted by realisation (r) number

Scenario uncertainty

The severity of E. antarctica’s response to climate warming is influenced firstly and inherently by the choice of emission scenario used as input into the SDM. By 2090, under the stabilising scenario RCP 4.5, SDM’s predict that E. antarctica will lose, on average, 6.21% of suitable habitat. This increases to a loss of 13.11% when the more severe scenario RCP 8.5 is used as input (Fig. 4). The variation amongst SDM outputs is also elevated when using simulations from RCP 8.5, rising from a standard deviation of ± 5.96% under RCP 4.5 to ± 10.24% under RCP 8.5.

Model uncertainty

Much of the variability in predictions of E. antarctica’s future distribution can be attributed to the climate model used to represent future climate conditions (Fig. 4). At the extremes, predictions ranged from a loss of suitable area of 28.84% (CCSM4, RCP 8.5) to an increase in suitable area of 2.91% (MPI-ESM-LR, RCP 8.5). This variation equates to differences in suitable habitat of over 5 million km². Although most ESM’s projected a more severe change to E. antarctica’s distribution under RCP 8.5 (either losing or gaining more suitable area), the SDM based on the GFDL-ESM2G climate model predicted a loss of area of 13.38% under RCP 4.5 and only 5.77% under RCP 8.5 (Fig. 4). Range overlap of future distributions based on different ESM’s was on average 87% (± 0.07) for RCP 4.5 and 88% (± 0.08) for RCP 8.5. The lowest range overlap, 64%, was between predictions based on the CNRM-CM5-2 and GFDL-ESM2G climate models (Table 2). Overall, spatial agreement in future distributions under different ESM’s is highest in the core range of E. antarctica and decreases towards range edges, specifically at the leading edge surrounding the Western Antarctic Peninsula and Weddell Sea (Fig. 5).

Table 2 Matrix of pairwise range overlap values for 30 future distribution maps of E. antarctica, each generated by one of 15 Earth System Models (ESM’s) under emission scenario RCP 4.5 (upper triangle) or RCP 8.5 (lower triangle)
Fig. 5

Quantifying the level of agreement in predictions of E. antarctica’s future distribution, when future climate conditions are simulated by (I.) 15 different Earth System Models (ESM’s), and (II.) three realisations of each ESM. Predictions under both emission scenarios RCP 4.5 and RCP 8.5 are shown

Internal variability

The variability in predictions of E. antarctica’s future distribution that can be attributed to the internal variability of an ESM was highly dependent on the ESM used (Figs. A2 and A3 in Online Resource 4). In the most extreme case (when using different realisations of the CNRM-CM5-2 climate model, RCP 8.5, as input into the SDM), the area of predicted suitable habitat differs by almost 700,000 km² between the realisations, ~ 4% of E. antarctica’s total range (Fig. 4). Additionally, when using different realisations of the climate model MPI-ESM-MR, RCP 4.5, two out of the three realisations predicted a loss of suitable area (3.37 and 2.53%) whilst one realisation predicted a slight increase in area (0.45%; Fig. 4). When comparing SDM outputs that used different realisations of the same ESM, range overlap in the predicted distributions varied from being highly consistent (CCSM4, RCP 8.5 = 99.9% [± 0.004]) to slightly variable (MPI-ESM-LR, RCP 4.5 = 95.6% [± 0.012]; Table 2). There was generally high spatial agreement in the predicted distributions of E. antarctica when different realisations of the same ESM were used as input to SDM’s (Fig. 5). However, this agreement tended to decrease at leading range edges, specifically around the Weddell Sea and Ross Sea regions (Fig. 5).
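
The area figures quoted under model uncertainty and internal variability can be checked with simple arithmetic. The short Python sketch below assumes that the percentage changes are expressed relative to the present-day suitable area reported above; the variable names are illustrative.

```python
PRESENT_AREA_KM2 = 17_503_869  # present-day suitable habitat reported in the Results

# Model uncertainty: largest loss (CCSM4) and largest gain (MPI-ESM-LR), RCP 8.5.
largest_loss_pct = -28.84
largest_gain_pct = 2.91
spread_km2 = (largest_gain_pct - largest_loss_pct) / 100 * PRESENT_AREA_KM2
print(f"Spread between ESMs: {spread_km2 / 1e6:.2f} million km2")  # 5.56

# Internal variability: ~700,000 km2 between CNRM-CM5-2 realisations (RCP 8.5).
realisation_diff_km2 = 700_000
share_pct = 100 * realisation_diff_km2 / PRESENT_AREA_KM2
print(f"Share of present-day range: {share_pct:.1f}%")  # 4.0
```

Under this assumption, the spread of 31.75 percentage points between the most extreme ESM’s corresponds to roughly 5.6 million km², consistent with the "over 5 million" stated in the text, and 700,000 km² is about 4% of the present-day range.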

Discussion

Using a case study species, Electrona antarctica, and by deconstructing future climate data into three sources of uncertainty, we have demonstrated the large variability in predictions of species responses to climate change that can arise from incorporating internal, model, and emission scenario uncertainty into analyses. Predicted loss of habitat, on average, doubled under a more severe Representative Concentration Pathway. Species predictions based upon different ESM’s ranged from substantial habitat loss of ~ 30%, to a marginal gain of 3%. When basing species predictions on multiple realisations within individual ESM’s, there was generally high spatial consistency, though in one instance the variation among SDM outputs was still enough to give opposing conclusions about the species’ response to change.

To our knowledge this is the first example of a systematic exploration of the effect that all three levels of climate uncertainty can have when predicting the future distribution of a marine species. Previous studies have focused on understanding the effect of using multiple RCP’s and ESM’s when simulating future climate conditions, for example with the commercially important grey snapper Lutjanus griseus (Hare et al. 2012), or on comparing structural uncertainty of SDM results between different ecological and climate models (Jones and Cheung 2015; Benedetti et al. 2017). More broadly, our finding of large variation in SDM outputs caused by the choice of ESM used to represent future conditions is in line with similar analyses, for example, on European trees (Goberville et al. 2015) and plants (Thuiller 2004), freshwater fish assemblages (Buisson et al. 2010), and African vertebrates (Garcia et al. 2012). Beaumont et al. (2007) investigated the effect of incorporating internal variability when predicting the future distributions of Australian butterflies and, similar to our findings, reported variability in SDM results due to multiple realisations of a single climate model.

It is clear from these examples that the choice of climate data on which to base an ecological prediction must be made carefully, and that using a single realisation, ESM, or emission scenario as input for an ecological prediction can lead to misleading and uninformative results. Yet from our literature review we find evidence of only moderate incorporation of climate uncertainties, with some receiving greater attention than others. Articles were more likely to include multiple RCP’s than ESM’s, and over 90% of studies failed to report information regarding the realisations or initializations used, with only one study explicitly stating that it had incorporated multiple realisations into analyses. The lack of similar reviews in other ecological disciplines means we are unable to compare the marine community’s efforts to others. However, half of the studies investigated here had based their predictions on two or more ESM’s, 10% more than was found by Porfirio et al. (2014) when investigating the terrestrial SDM literature dating between 1982 and 2013. Additionally, our findings support those of Payne et al. (2016) that climate uncertainty is generally treated as one element, and that the internal variability of ESM’s is rarely accounted for in the marine ecology literature.

Whilst the marine literature had a higher percentage of studies using multiple ESM’s than was recorded for terrestrial studies, the most commonly used ESM’s tend either to over- or under-project future SST relative to a multi-model mean (Fig. 1) and have previously been found to have relatively high levels of internal variability for marine variables (Frölicher et al. 2016). Although studies that use these four common ESM’s together may be incorporating a broad range of possible SST conditions, reliance on any one of these alone (which was the case for 40% of studies that used a single climate model) may affect the magnitude of the ecological response being investigated. Indeed, this study highlights that climate models can project extremely different rates of change, both temporally and spatially, and that these differences are reflected in the ecological predictions that are made. For example, the climate models which generated predictions of extreme loss or gain of E. antarctica habitat are also those that have the highest and lowest rates of SST warming in the Southern Ocean, respectively. Regions that had decreased agreement between SDM outputs (e.g. the Weddell and Ross Seas) are also regions characterised by high variability in climate model projections of SST.

Variability in predicted species responses to climate change due to different emission scenarios is arguably the most obvious and inherent source of climate data uncertainty, and by the end of the twenty-first century (2070–2100), which is the most frequently used time period for marine prediction studies, scenario uncertainty is expected to dominate over the other sources for a large proportion of the Earth’s surface (Hawkins and Sutton 2009). Thus, it is perhaps unsurprising that it has been given priority in the literature. Yet our case study demonstrates large variation in a species’ response to climate change, and that when multiple realisations are available, using all three levels of uncertainty to simulate the future climate is the most appropriate action. It is often necessary to make compromises over which uncertainties can be integrated into a study due to the amount of data processing, resource constraints, or when a research question focuses on other sources of uncertainties. If such decisions are to be made, ecologists must consider how climate uncertainties interact and which ones are of greatest concern for their study region, time period of interest, and the environmental variables being used (Cheung et al. 2016a; Frölicher et al. 2016).

Once used, predicted distributions obtained under multiple RCP’s, ESM’s, and realisations must be communicated transparently and effectively to convey the full range and confidence of the ecological predictions being made. There are multiple ways of summarising ensemble results, including the area in which at least one model, or all models, predicts species occurrence (bounding box), the area in which 50% of predictions show overlap (consensus forecast), and the probability of distribution change as a probability density function (Araujo and New 2007; Harris et al. 2014; Porfirio et al. 2014). A range of these were found in the marine literature and the presence of a variety of communication methods is promising but also highlights that there is no standard approach in communicating species predictions. Half of the studies from the literature review chose only to use the multi-model mean, despite advice that this should be avoided in most circumstances (Beaumont et al. 2007). To summarise the results for E. antarctica, we present future distribution maps based upon realisation and ESM agreement (Fig. 5). This method has been favoured among conservation managers because it shows clear priority areas for conservation (Porfirio et al. 2014). It also conveys the range of potential outcomes and the level of confidence in the findings.
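
The ensemble summaries listed above can be illustrated with a short Python sketch. It is a sketch under stated assumptions rather than a method taken from the cited studies: binary maps are stored as lists of rows containing only 0 and 1, the consensus forecast is read as "at least 50% of predictions agree", and the function name and example maps are hypothetical.

```python
def ensemble_summaries(binary_maps, consensus_fraction=0.5):
    """Summarise 0/1 prediction maps as:
    - votes:      number of models predicting presence in each cell
    - any_model:  cells where at least one model predicts presence
    - all_models: cells where every model predicts presence
    - consensus:  cells where the share of models predicting presence
                  is at least `consensus_fraction` (assumed reading)."""
    n = len(binary_maps)
    votes = [[sum(cells) for cells in zip(*rows)] for rows in zip(*binary_maps)]
    return {
        "votes": votes,
        "any_model": [[int(v >= 1) for v in row] for row in votes],
        "all_models": [[int(v == n) for v in row] for row in votes],
        "consensus": [[int(v / n >= consensus_fraction) for v in row] for row in votes],
    }


# Four hypothetical model outputs on a 1 x 4 grid (1 = predicted presence).
maps = [
    [[1, 1, 1, 0]],
    [[1, 1, 0, 0]],
    [[1, 0, 0, 0]],
    [[1, 0, 0, 0]],
]

summary = ensemble_summaries(maps)
print(summary["votes"])       # [[4, 2, 1, 0]]
print(summary["any_model"])   # [[1, 1, 1, 0]]
print(summary["all_models"])  # [[1, 0, 0, 0]]
print(summary["consensus"])   # [[1, 1, 0, 0]]
```

The "votes" layer corresponds to the agreement maps presented in Fig. 5, while the other layers show how the same ensemble can lead to broader or narrower predicted ranges depending on the summary chosen.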

Uncertainty in predictions of ecological responses can also arise from parameters and data used in the biological model, for example, in evaluation of parameter estimates, model performance, or the spatial and temporal scale of the model (Beaumont et al. 2008). These sources of uncertainty were reviewed by Planque et al. (2011) in a manner similar to the present study. The authors reviewed the marine literature to determine whether studies predicting species distributions had adequately reported uncertainty arising from modelling procedures, concluding that there was little evidence of sufficient reporting and that predictions were not as reliable as previously assumed.

These issues can be somewhat ameliorated by recent developments in the wider SDM literature, where ways to improve predicted distributions are now widely discussed (Araujo and Guisan 2006; Beaumont et al. 2008; Elith and Leathwick 2009; Elith et al. 2010; Beale and Lennon 2012; Porfirio et al. 2014; Jarnevich et al. 2015). There are also specific publications guiding ecologists through many of the common sources of uncertainty, for example, regarding observation bias (Wisz et al. 2008; Stolar and Nielsen 2015), modelling approaches (e.g. empirical and mechanistic) (Kearney et al. 2010), algorithm settings (Merow et al. 2013; Boria et al. 2017) and comparisons (Elith et al. 2006; Ortega-Huerta and Peterson 2008; Guillera-Arroita et al. 2015), evaluation metrics (Lobo et al. 2008), and collinearity (Braunisch et al. 2013; Dormann et al. 2013). Given the amount of literature addressing this subject, we focused here only on the use of climate uncertainty, but stress the need to account for all sources of uncertainty and incorporate, where appropriate, multiple modelling algorithms. There is recent evidence of this being applied to marine ecology research (Jones and Cheung 2015; Cheung et al. 2016b; Legrand et al. 2016).

One insight from our review is that a major limitation when creating robust predictions of ecological responses in marine ecology is having adequate access to CMIP5 data, and/or knowledge of how to process raw climate data. In over 65% of studies, the source of the CMIP5 climate data used was not reported. When it was, the data were often in a format (NetCDF) that requires complex processing to produce the smaller, manageable raster files most commonly used in distribution modelling. Though this is a problem that could be encountered by all ecologists, it is particularly restrictive in the marine community as databases that contain a broad range of future environmental variables (not only SST but O2, pH, salinity and primary productivity) from multiple ESM’s in a rasterized format are largely lacking or only provide data from the previous CMIP3 modelling efforts (though see interactive tools such as NOAA’s climate change web portal and Clim System’s SimCLIM for ArcGIS). Further development of these tools to include more ocean variables and realisations, greater communication between marine and climate scientists, as well as increased data sharing amongst marine ecologists will be necessary to improve data clarity and accessibility (Beaumont et al. 2008; Harris et al. 2014; Payne et al. 2016). As a step towards this goal, the 62 global SST simulations used in this study will be made available via the Dryad Data Repository (doi:10.5061/dryad.4f98t), providing a resource for ecological, conservation, and policy-driven studies in rapidly changing marine environments.

Conclusions

Predicting species and ecological responses in the face of climate warming can be a useful exercise when implemented correctly, with growing practical applications. Whilst it is impossible to remove climate uncertainty, and much will only be helped by advances in climate science (e.g. in parameterisation and resolution), ignorance of these uncertainties by ecologists can be highly detrimental to those acting upon published results. We have reviewed the marine literature and found evidence that the marine ecology community is only moderately addressing climate uncertainty, despite a generally high awareness of it, with improvement necessary in the incorporation of internal variability, broader representation of ESM’s, and clearer communication of results. Moreover, with our case study species, Electrona antarctica, we demonstrated that a full and transparent incorporation of climate uncertainty is possible, and that it plays a crucial role in creating reliable predictions. We identified possible solutions which may overcome current limitations in utilising climate data. These include easier access to processed climate data covering, to some extent, all levels of climate uncertainty, which would provide an incentive for marine ecologists to increase the amount of uncertainty being incorporated into their analyses. This should, in turn, promote clearer communication of all possible outcomes and an overall increase in the quality and standard of studies that predict ecological responses in a changing ocean.