Sources and implications of deep uncertainties surrounding sea-level projections

Long-term flood risk management often relies on future sea-level projections. Projected uncertainty ranges are, however, widely divergent as a result of different methodological choices. The IPCC has condensed this deep uncertainty into a single uncertainty range covering 66% probability or more. Alternatively, structured expert judgment summarizes divergent expert opinions in a single distribution. Recently published uncertainty ranges that are derived from these “consensus” assessments appear to differ by up to a factor of four. This might result in overconfidence or overinvestment in strategies to cope with sea-level change. Here we explore possible reasons for these different interpretations. This is important for (i) the design of robust strategies and (ii) the exploration of pathways that may eventually lead to some kind of consensus distributions that are relatively straightforward to interpret.


Introduction
increase the frequency of harmful floods (Tebaldi et al. 2012). The management of the associated risks requires a sound understanding of potential local SLR, including low-probability, high-impact events (Kopp et al. 2014; De Vries et al. 2014; Grinsted et al. 2015).
Local SLR can significantly deviate from the global signal due to non-oceanic effects such as subsidence and post-glacial rebound, and due to oceanic effects (Slangen et al. 2012). Changes in ocean circulation, heterogeneous density changes, and mass-loss of large ice bodies (e.g., affecting the gravity field, Earth's rotation, and lithospheric flexure) may cause distinct spatial patterns. Local SLR projections thus require a separate treatment of these major components, including thermosteric expansion and mass loss of ice sheets, ice caps, and glaciers.
Sea-level projections are deeply uncertain (Hulme et al. 2009; Ranger et al. 2013; Applegate et al. 2015; Oppenheimer et al. 2016). Deep uncertainty occurs when experts and/or decision makers “do not know or cannot agree upon the system model relating actions to consequences or the prior probabilities on key parameters of the system model” (Lempert and Collins 2007). Experts disagree on the best methods to assess the uncertainties (Church et al. 2013; Moore et al. 2013), and their subjective probability estimates of SLR are widely divergent.
When confronted with deep uncertainties, analysts have to make a choice (see, for example, Budescu et al. 2014). One option is to ignore the deep uncertainties, i.e., to present a single pdf, perhaps accompanied by a disclaimer that severe changes outside the presented range are possible. From a decision-making perspective, this alternative seems less favorable because decision makers tend to act differently when confronted with deep uncertainties or imprecise probabilities rather than with well-defined probabilities (e.g., Ellsberg 1961; Budescu et al. 2014). A second option is to try to achieve consensus and condense the deep uncertainties into a single “consensus” distribution. A third option, for example if a consensus does not (yet) appear possible, is to present decision makers with the key aspects of this deep uncertainty.
Structured expert judgments can be useful in case of ambiguity, disagreeing models, and lack of empirical evidence (Oppenheimer et al. 2007; Aspinall 2010; IAC 2010; Mastrandrea et al. 2010; Bamber and Aspinall 2013; Horton et al. 2014). According to Cooke and Goossens (2008), “structured expert judgment refers to an attempt to subject the decision process to transparent methodological rules, with the goal of treating expert judgments as scientific data in a formal decision process.” Expert agreement is not the objective of structured expert judgment. Rather, it intends to explore the range of views and to help build a political or rational consensus (Cooke and Goossens 2008). Rational consensus can be achieved by means of a method that is defined and agreed on before eliciting the experts (Cooke and Goossens 2008; Aspinall 2010). It is, however, non-trivial whether and how to combine expert opinions (Morgan and Keith 1995). As a consequence, the reliability of structured expert elicitations is often questioned (e.g., Keith 1996; Church et al. 2013; Gregory et al. 2014; Clark et al. 2015).
As an alternative to structured expert judgment, the IPCC's Fifth Assessment Report (hereafter, AR5) presents a likely range. According to the Guidance Note for Lead Authors of the IPCC Fifth Assessment Report on Consistent Treatment of Uncertainties, a likely outcome means that “the probability of this outcome can range from ≥66% (fuzzy boundaries implied) to 100% probability” (Mastrandrea et al. 2010). The likely range explicitly builds on the agreed-on, current state of knowledge. Aiming for scientific rigor and consistency with literature, the IPCC authors have chosen not to account for poorly understood mechanisms, like the collapse of the marine-based sectors of the Antarctic ice sheet, in the likely range (Church et al. 2013; Gregory et al. 2014). This range has been criticized for being overconfident and ignoring semi-empirical model studies (Kerr 2013; Rahmstorf 2013; Grinsted 2014). However, two recent studies seem fairly consistent (Mengel et al. 2016; Kopp et al. 2016) and many local projections (partly) rely on AR5 and its model ranges (e.g., Kopp et al. 2014; De Vries et al. 2014; Grinsted et al. 2015).
Yet, a likely range (i.e., spanning 66% probability or more) leaves users considerable room for interpretation, resulting in large differences between the various assessments (Fig. 1). For example, the climate change scenarios for the Netherlands (KNMI14) (Van den Hurk et al. 2014; De Vries et al. 2014) project a relatively small 90% probability range (red bars). In contrast, Kopp et al. (2014) project a 90% probability range that is twice as large (dark green bars). Grinsted et al. (2015) project an even larger 90% probability range (purple bars). These studies present clear justifications of their methodological choices, but the consequences for projected ranges and derived decisions receive relatively little attention. The discrepancies between the uncertainty ranges of different studies are a sign of deep uncertainty. Understanding the sources of this deep uncertainty can be a key step to better support risk and decision analysis because this understanding may (i) help to build consensus and (ii) inform the design of robust strategies (e.g., Hall et al. 2012; Singh et al. 2015; Hadka et al. 2015).
Here, we explore the main reasons for the different interpretations of exactly the same information on SLR. This insight may be useful for designing strategies to cope with the deep uncertainties surrounding sea-level projections. In the longer run, eliciting the reasons for these divergent projections may help reduce ambiguities or help build rational consensus. To demonstrate and quantify these effects, we first explore how interpreting a given uncertainty in projections as representing differing likelihoods impacts probabilistic projections. Next, we discuss the potential role of structured expert judgment. Finally, we explore how structured elicitation might be utilized to reduce the current deep uncertainties and help to better inform risk and decision analyses.

The interpretation of the IPCC's likely range
One potential problem with the IPCC's likely range is that users can interpret it to span anywhere from 66% up to 100% probability, all consistent with AR5. Probabilistic sea-level projections can depend critically on this probabilistic interpretation. This is illustrated in Fig. 2, which shows how interpreting a given uncertainty range (spanned by the horizontal yellow lines) as representing different likelihoods (shown on the x-axis) extrapolates to differing 66, 90, and 99% probability ranges (y-axis), assuming normal and lognormal distributions for the different contributions (see the Appendix for the derivation of this figure). As seen from Fig. 2, the projected uncertainty range widens more than linearly as the probability assigned to the likely range decreases. This dependence on interpretation can largely explain the factor two difference in the projected 90% probability range between KNMI14 (red vertical bar) and Kopp et al. (2014) (green vertical bar). KNMI14 implicitly interprets the likely range as the 90% probability range, i.e., that the sea-level rise will be within this range with 90% probability. Many other studies (e.g., Kopp et al. 2014) interpret the likely range as the 66% probability range. The rationale given by KNMI14 is to be methodologically consistent with AR5 and internally consistent within KNMI14, and in this way to provide a widely accepted and actionable common framework for climate change adaptation in the Netherlands. The 66% probability interpretation is typically neither explicitly motivated nor referenced. The 66% probability interpretation can make sense, for example, if the objective is to produce wide (conservative) uncertainty ranges. From a robust decision-making perspective, conservative projections may be preferable to overconfident projections (see, for example, the discussions in Herman et al. 2015 and Bakker et al. 2016).
Fig. 2 Projected probability range of global sea-level change between 1995 and 2090 following RCP8.5 (y-axis) as a function of the probabilistic interpretation of the IPCC's likely range (x-axis). The IPCC communicates the uncertainty of future sea level as a likely range (yellow arrow, lines). Future sea-level rise is expected to fall within this likely range with 66% probability or more. Any probability between 66 and 100% is consistent with this IPCC assessment (black arrow). The projected probabilistic uncertainty ranges (blue shades and dashed lines) depend strongly on the probabilistic interpretation of the likely range (x-axis). For example, the projected 90% probability range of KNMI14 (red vertical bar) is about half of the 90% probability range projected by Kopp et al. (dark green vertical bar), mainly because of a different probabilistic interpretation of the IPCC's likely range.
The likely range (i.e., spanning 66% probability or more) gives no clear lead on how to estimate higher quantiles, like 1:100, that are decision relevant (Kopp et al. 2014). The applied interpretation of the likely range and the assumed distribution function largely determine the extrapolation. Yet, the scientific foundation for this methodological choice is largely unclear.
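The sensitivity to the probabilistic interpretation can be illustrated with a minimal sketch. It assumes, as a simplification, that total sea-level rise follows a single normal distribution centered on the midpoint of the likely range, whereas Fig. 2 combines normal and lognormal distributions for the individual contributions. The function name `rescale_central_range` and the range of 0.45 to 0.82 m are illustrative assumptions, not values taken from the text above.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used only for its quantile function


def rescale_central_range(lo, hi, p_interp, p_target):
    """Rescale a central range [lo, hi].

    The range is read as covering probability p_interp of a normal
    distribution; the function returns the central range of that same
    distribution covering probability p_target.
    """
    z_interp = _STD_NORMAL.inv_cdf(0.5 + p_interp / 2)
    z_target = _STD_NORMAL.inv_cdf(0.5 + p_target / 2)
    mid = (lo + hi) / 2
    sigma = (hi - lo) / (2 * z_interp)  # implied standard deviation
    half_width = z_target * sigma
    return mid - half_width, mid + half_width


# Hypothetical likely range in metres (illustrative placeholder values).
likely_lo, likely_hi = 0.45, 0.82

for p_interp in (0.66, 0.90):
    lo90, hi90 = rescale_central_range(likely_lo, likely_hi, p_interp, 0.90)
    print(f"likely range read as {p_interp:.0%}: "
          f"90% range = {lo90:.2f} to {hi90:.2f} m")
```

Under these assumptions, reading the likely range as a 66% range rather than a 90% range widens the derived 90% range by a factor of about 1.7, independent of the numerical range chosen.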

Expert elicitation and the role of ice sheets
Some have attributed the deep uncertainties surrounding sea-level projections to the response of the large ice sheets (Church et al. 2013; Bamber and Aspinall 2013). For instance, simulated and elicited projections of the Antarctic ice sheet contribution for the twenty-first century range from a few centimeters of global sea-level drop (Church et al. 2013) to an implied drastic rise of several meters resulting from an almost complete disintegration of the West-Antarctic ice sheet (WAIS) (Pollard et al. 2015).
Probabilistic statements on high ice-sheet contributions are, however, controversial (Gregory et al. 2014; Clark et al. 2015). Different approaches result in widely diverging projections, as shown before in Fig. 1. For example, KNMI14 (red bars) applies, in line with AR5, physically reasoned/modeled upper limits as proposed by Katsman et al. (2011), whereas Perrette et al. (2013) (cyan bars) provide much larger uncertainty ranges based on a semi-empirical approach. Alternatively, some studies apply expert judgments (e.g., Grinsted et al. 2015; Kopp et al. 2014), notably those elicited by Bamber and Aspinall (2013, hereafter BA13). Yet, different combination rules result in large differences too.
BA13 applies a structured expert elicitation to inform low-probability SLR projections. For this, they elicited from 13 experts their subjective estimates of, amongst others, the 5, 50, and 95% quantiles of future ice sheet decrease rates in 2100 in response to RCP8.5. Assuming a constant acceleration in time, they subsequently estimate the 5, 50, and 95% quantiles of the cumulative sea-level contributions in the year 2100 (Fig. 3, black crosses). For comparison and extrapolation (needed to estimate lower probability events such as 1:100 and 1:1000), we fitted beta distributions applying a 3.3-m (i.e., total WAIS collapse) upper bound (black lines). In order to provide a rational consensus, BA13 applies performance-based weights (i.e., weights based on the performance on seed questions on quantile estimations) to combine the expert opinions (purple dots) (Cooke 1991). This method has been shown to be superior to several other methods such as equal weighting and quantile weighting (Cooke and Goossens 2008; Bamber et al. 2016). Grinsted et al. (2015) explicitly use the BA13 expert consensus to estimate the uncertain contribution of the “ignored” processes and add the BA13 estimates to the AR5's likely range. Kopp et al. (2014), on the other hand, acknowledge the scientific consensus of AR5 and only use BA13 to estimate the higher quantiles. Aiming for a smooth extrapolation, they fit a lognormal distribution to BA13's expert consensus and scale this distribution down to match the AR5's likely range. As a consequence, the 90% probability range of Kopp et al. (2014) is less than half of the range of Grinsted et al. (2015) (Fig. 1).
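How a fitted distribution turns a few elicited quantiles into estimates of rarer events can be sketched as follows. The example pins a two-parameter lognormal distribution exactly to a median and a 95% quantile and then reads off higher quantiles. Using these two quantiles is an assumption of this sketch and not necessarily how Kopp et al. (2014) performed their fit; the subsequent scaling to the AR5 likely range is not reproduced. The function names and the input values (0.10 m and 0.50 m) are hypothetical.

```python
from math import exp, log
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def lognormal_from_quantiles(q50, q95):
    """Return (mu, sigma) of the lognormal matching a median and a 95% quantile.

    Requires 0 < q50 < q95.
    """
    mu = log(q50)  # the median of a lognormal is exp(mu)
    sigma = (log(q95) - mu) / _STD_NORMAL.inv_cdf(0.95)
    return mu, sigma


def lognormal_quantile(mu, sigma, p):
    """Quantile function of the lognormal with parameters mu and sigma."""
    return exp(mu + sigma * _STD_NORMAL.inv_cdf(p))


# Hypothetical pooled estimates of an ice-sheet contribution in metres.
mu, sigma = lognormal_from_quantiles(q50=0.10, q95=0.50)

for p in (0.95, 0.99, 0.999):
    print(f"{p:.1%} quantile: {lognormal_quantile(mu, sigma, p):.2f} m")
```

Because the lognormal has no upper bound, the extrapolated 1:100 and 1:1000 values depend entirely on the assumed distribution family, which is one reason why a bounded alternative such as the beta distribution used for Fig. 3 can give different tails.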
In a critical assessment of BA13's elicitation, De Vries and Van de Wal (2015) propose a third approach. Concerned about the weighting of experts and an overly large influence of outlier opinions, they suggest using a median estimate (the red dots), a method previously applied by Horton et al. (2014). Median pooling is shown to be especially powerful in the case of a large group of experts, relatively little overconfidence (Park and Budescu 2015; Gaba et al. 2016), and when the intended decision is not driven by the tail behavior of the uncertainty (Hora et al. 2013). Otherwise, the median approach may result in overconfident projections (Park and Budescu 2015).
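The influence of the combination rule on the upper tail can be made concrete with a small sketch that pools elicited quantiles in three ways: quantile-wise averaging, quantile-wise median, and enveloping (the minimum lower estimate and the maximum upper estimate, Park and Budescu 2015). The expert values are invented for illustration and are not the BA13 data. Quantile-wise averaging is used here only as a simple stand-in; it is not the equal-weight or performance-weighted pooling applied by BA13.

```python
from statistics import mean, median

# Hypothetical (5%, 50%, 95%) quantile estimates in metres, one tuple per expert.
experts = [
    (-0.02, 0.05, 0.20),
    (0.00, 0.10, 0.45),
    (0.01, 0.08, 0.30),
    (0.02, 0.15, 1.20),  # an outlier opinion with a heavy upper tail
    (-0.01, 0.06, 0.25),
]


def pool_quantiles(estimates):
    """Combine per-expert (q05, q50, q95) tuples with three simple rules."""
    q05, q50, q95 = zip(*estimates)
    return {
        "mean": (mean(q05), mean(q50), mean(q95)),
        "median": (median(q05), median(q50), median(q95)),
        # Enveloping keeps only the widest bounds across all experts.
        "envelope": (min(q05), max(q95)),
    }


pooled = pool_quantiles(experts)
for rule, values in pooled.items():
    print(rule, [round(v, 2) for v in values])
```

In this made-up example the single outlier leaves the median-pooled 95% quantile unchanged, raises the averaged one, and fully determines the envelope, which mirrors the trade-off between robustness to outliers and overconfidence discussed above.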

Conclusions and discussion
We have shown how different, all arguably reasonable, interpretations of the imprecise information of the IPCC can result in widely divergent and deeply uncertain sea-level projections. Approaches to address this problem by eliciting and combining (subjective) information from experts have provided useful insights, but still result in deeply uncertain projections.
The examples illustrate that the construction of a consensus estimate from divergent expert assessments can be subject to considerable structural (and deep) uncertainty. This is consistent with the previous assessment that there is no “objective basis for combining expert opinion” (Keith 1996). Given this deep uncertainty, many (e.g., Keith 1996; Keller et al. 2008; Lempert et al. 2012) have argued that a robust strategy, i.e., one that performs well over a wide range of plausible futures/views (Lempert et al. 1996), may be preferable to optimal strategies. Yet, depending on the applied decision criterion, the assessed robustness of a strategy can critically hinge on the range of views considered. Thus, robust strategies can also be very sensitive to outlier opinions and the way divergent expert assessments are aggregated (or not).
Many studies are silent on the aspect of deep uncertainty, for example by providing a single probability density function. This omission may lead to inconsistent decisions. Decision makers' preferences often change when confronted with deep uncertainty (Ellsberg 1961; Budescu et al. 2014). Improving its communication, e.g., by providing multiple plausible pdfs, can help to inform the design of more robust risk management strategies. Effective communication of deep uncertainties, however, depends strongly on the decision context. Therefore, an efficient representation requires a tight interaction between decision analysts, scientists, and decision makers.
Fig. 3 Survival functions (1 minus the cumulative distribution function) of the cumulative West-Antarctic ice sheet contribution to sea-level rise for the twenty-first century assuming RCP8.5, based on expert-elicited quantile estimates and different combination strategies of quantile estimates. Black crosses are individual quantile estimates (5, 50, and 95%) of 13 experts elicited by Bamber and Aspinall (2013). The colored dots represent different strategies to aggregate the individual estimates. PerfWts (purple) is the aggregated estimate of Bamber and Aspinall (2013) applying expert performance weighting. EqualWts (green) represents their equal-weight aggregation (Bamber and Aspinall 2013), and Median (red) median aggregation. Orange (MaxRange) represents enveloping, i.e., the use of the minimum estimate for the lower bounds and the maximum estimate for the upper bounds (Park and Budescu 2015). The continuous lines are fitted beta distributions assuming a 3.3-m (i.e., total disintegration) upper limit. The dashed blue line is the uniform distribution.