1 Introduction

Within the overarching discipline of climate science, a field of research devoted to establishing the causal relationships between anthropogenic climate change and extreme weather events sits at the core of the discipline. This field is called detection and attribution. Traditionally, the concern has been to detect a climate change signal in a noisy and variable climate system and to attribute this change correctly to human emissions of greenhouse gases or other human alterations of the Earth’s radiation balance. More recently, increasing attention and effort has been devoted to a different set of questions, namely, attributing climate change as a cause, or partial cause, of single extreme weather events. This is often called extreme event attribution or single event attribution. In what follows, we shall use the acronym EDA (event detection and attribution) to designate the sub-field of detection and attribution (DA) science that is devoted to single events.

Pursuing such questions has often been thought to be of considerable value, not only within science itself, but also in the pursuit of various broader social aims, for example, in attempts to build a comprehensive framework for climate change adaptation. With extreme weather events becoming both more common and more severe, the issue typically arises when considering the consequences of a hurricane or a flood which we could not, or did not, prevent through mitigation, nor successfully adapt to. Moreover, scientifically establishing causal links between climate change and individual extreme events holds out the prospect of a framework under which legal liability for damages could be pursued, which may well provide a viable route to pressuring emitters of greenhouse gases or governments into action (Allen 2003).

However, there is no guarantee that approaches and methodologies that have proved adequate for other tasks will be equally satisfactory for this one. For a few years, a dispute flared up within the EDA community concerning the appropriate methodology for attributing climate change as a cause of extreme events. The verdict nowadays might perhaps be that none of the present approaches provides answers to all the reasons we have for engaging in attribution. In a recent paper, Winsberg et al. (2020) have analyzed this dispute. They argue that what may initially seem like a genuine methodological disagreement about which kind of approach is best for a particular kind of situation is actually a dispute about values. To a large extent, the discussion has centered on type I and type II errors.

In this paper, we adopt a pluralist strategy based on the idea that single event attribution inevitably involves problem-feeding (Thorén and Persson 2013) to various scientific fields and solution-feeding to different local attribution contexts (sometimes motivated by completely different reasons). We do this because we do not think there is reason to expect one general solution to all the different single event attribution problems EDA faces, and because tools are needed to manage the plurality of values involved. The present framing of the discussion in terms of type I and type II errors is not optimal, we think, even if it has been a key motivating factor for the climate scientists involved. The classical definition of type I versus type II errors applies to one and the same prediction, e.g., does this person have a certain kind of illness or not? Attribution questions, however, are not the same when seen through the different approaches. Each approach frames (or reframes) the question differently, which makes the line of argument developed here, about the need for attention to problem–solution coordination, relevant. Here we provide the first outline of such a pluralist strategy.

2 Problem-feeding

Before we proceed to detail the relationship between different approaches to single event attribution, it is useful to say a few words about the framework we will deploy. The notion of problem-feeding is elaborated in Thorén and Persson (2013) and designates an interdisciplinary relationship in which problems are exchanged between disciplines. Schematically portrayed, problem-feeding happens when one discipline formulates a question or runs into a problem that can only be solved by, or with the assistance of, another discipline. If the problem is transferred to the discipline with the appropriate resources, we say that problem-feeding has taken place. If the receiving discipline solves the problem and that solution is used in the discipline where the problem arose to begin with, we say that the problem-feeding has been bilateral. In short, problem-feeding with solution-feeding is bilateral; otherwise, it is unilateral.

This has been suggested as a normative model for interdisciplinary relationships (see Thorén and Persson (2013)), but here we shall mostly use the framework to analyze the uses of problems and problem-solutions within and outside of the context in which they are produced.

The point of using the framework in this context is to point out that approaches to solving problems, producing answers, and putting those answers to use can be spread out over different contexts. This is of consequence, or so we shall argue, when considering whether approaches are competing or complementary, since what passes for an adequate solution to a problem is context-dependent. This gives rise to specific considerations within EDA science since, for the most part, the context in which the problem is solved is not exactly the same as the one in which the solution is applied.

3 Single event attribution problems

The “attribution problem” as conventionally conceived concerns climate, not weather. That is: what would the climate have been like had there been no human interference with the atmosphere? Providing answers to this question, typically in probabilistic terms, has been a prioritized matter in the scientific community at least since the Second Assessment Report of the Intergovernmental Panel on Climate Change (see Otto (2017)). The causal link between increasing concentrations of greenhouse gases in the atmosphere and changes in global temperature has successively been established more strongly and is now unequivocal (IPCC 2013). The harder challenge has always been to attribute singular events to climate change, something widely believed to be impossible, due to the background variability of the climate system and the paucity of high-quality data, until a method was proposed in the early 2000s (see below).

There are many reasons why singular event attribution is desirable and why it is a problem not to have good scientific answers to such questions. We have already mentioned the possibility of attributing legal liability for extreme climate events, which has been an important motivation from the start. But beyond that, there are others too. For example, a more accurate general appreciation of the costs of climate change could feed into the much-disputed damage functions underpinning many integrated assessment models (Frame et al. 2020). In the following, we distinguish between three kinds of shortcomings that a process leading to a proposed answer to an attribution problem can have.

Statisticians have long distinguished between type I and type II errors. A type I error consists in rejecting a true null hypothesis. For example, if the hypothesis to be tested is that the probability of extreme weather events under some given perturbation of atmospheric chemistry has increased, the standard null hypothesis is that no such relationship obtains. When a type I error is made, the relationship denied by the null hypothesis is mistakenly detected. In the context of climate attribution, one would then have overstated the role of climate change in causing the extreme event in question. A type II error consists in accepting a false null hypothesis. In the outlined situation, it would involve not detecting an increased probability of extreme weather events although that probability has in fact increased, i.e., understating the role of climate change in causing the extreme event.

The most common reason for type II errors is lack of data (the “power” of a study is the probability of rejecting the null hypothesis when it is false and is used for sample size calculations when planning a study). To these two, we add a version of what statisticians sometimes refer to as a type III error: giving a correct answer (one involving no type I or type II error) but to the wrong question. For example, it can happen that the causes behind the intertemporal variation in the occurrence of a disease are different from the causes of variation within a population at a given point in time. When such differences in the causal backdrop are not carefully observed, there is a risk of type III error. In this setting, type III errors typically occur as a result of poor translation or problem/solution-feeding between one field and another (see, e.g., Wahlberg and Persson (2017) and Thorén et al. (2021)). If a type III error is detected, it can result in the conclusion that no relevant answer to the problem has been presented.

Attempts to tease out the various reasons why attribution is important have been discussed in, for instance, Hulme (2014). We note that there are at least four reasons for attribution:

  1. Attribution for science. Scientifically it is important to tease out the causalities, and, in this case, from a presumed purely epistemic standpoint, it should not really matter whether errors are of type I or type II. The aim is to reduce both types of errors as much as possible. Traditionally, type II errors have perhaps been of slighter concern to the scientific community (and more of a concern in risk management). Not acknowledging a causal relationship which exists until better evidence is produced slows science down in the short run but might have practical consequences for the decision-maker that are unacceptable. As far as standard setting goes, reducing one type of error comes at the expense of increasing the other type of error (i.e., accepting a causal claim which is in fact false). A climate scientist would say that the diagnostic power of a test is expressed in its receiver operating characteristic (ROC) curve, where one has to decide on the preferred type I versus type II ratio. Attempted problem-feeding is a very common interdisciplinary mechanism, and to the extent that we aim for scientific answers to questions about climate change, human behavior, and legal responsibility, attribution for science also involves risks of error more similar to type III (for instance, that scientific inquiry has resulted in a correct answer about how to adapt when the question was about how to mitigate). However, depending on whether one is an optimist or a pessimist about the future development of the interdisciplinary program, this might be conceived of as less or more problematic (Persson et al. 2018). The need for attention to problem–solution coordination is obvious in the following three reasons but exists also in the first. Hulme (2014, p. 501) adds that the veracity of climate simulation modeling “piques the scientific mind” in new ways.

  2. Attribution for policy making. Here we can often lean on an old tradition in climate policy of no-regrets policies (Bulkeley 2001). A no-regrets policy implies that we should take the warnings about climate change seriously and start acting as if they are correct, as long as the measures taken are beneficial for society, e.g., by reducing air pollution and improving energy security. Provided that the null hypothesis is formulated in terms of there being no causal relation between climate change and an extreme weather event, a type I error would be acceptable as long as the resulting policy does little or no harm.

  3. Attribution for legal compensation. As already noted, the prospect of supporting legal action against, e.g., CO2 emitters or regulators with scientific knowledge has figured as an important motivating factor behind developing single event attribution from the very start (Allen 2003). Legal evidentiary requirements depend on where in the legal system a case is brought. In criminal court, it is typical to try to minimize type I errors by appealing to, e.g., Blackstone’s ratio (“for the law holds that it is better that ten guilty persons escape than that one innocent suffer”) (Blackstone 1787). A problem here, of course, is that type II errors are correlated with data availability. In regions where high-quality data are available, these errors may be kept to a minimum at the same time as type I errors are controlled, but where data are scarce, the probability is high that false null hypotheses cannot be rejected. Minimizing type I errors can under such circumstances lead to a high share of type II errors (see also Hulme (2014)). Most climate litigation takes place in civil court, however, where the evidentiary requirements are typically considerably lower, appealing to notions such as “a preponderance of evidence.” Such requirements imply a different balance of type I and type II errors (Lloyd et al. 2021).

  4. Attribution for future adaptation and mitigation decisions. There is evidence that strong belief in local effects of climate change is a prerequisite for adaptation decisions in some contexts involving individual decision-makers (see, e.g., Blennow et al. (2020)). Successful attribution is a driver of adaptation and mitigation. For the rational decision-maker, it should not really matter if errors are of type I or type II; the aim is to reduce both types as much as possible. There is reason to believe that type III errors are a considerable problem in this context. First, the statistical content of scientific conclusions is difficult to communicate to decision-makers (see, e.g., Hoffrage et al. (2000)). Second, the relatively sharp distinction between mitigation and adaptation in climate-related science cannot without risk be assumed to have a counterpart among decision-makers. In general, knowledge of particular cases might improve justification, planning, and execution of climate adaptation (Hulme 2014).

4 Available approaches and methods

Although the potential problems of processes aiming to solve single event attribution problems are to some extent general and independent of the context of application (and consequently can be expressed in the statistical vocabulary of type I, II, and III errors), most of the substantive problems depend on specific features of the approaches taken and the quality of the data that is in fact available. In DA, two approaches have been at the center of the debate: the standard approach and the storyline approach, discussed below.

4.1 The standard approach

What has become the standard approach to EDA—often called the risk-based approach, the Oxford approach, or probabilistic event attribution—was developed in the early 2000s (Stott et al. 2003; Allen & Stott 2003). The contention before this point in time was that it was not meaningful to try to attribute individual events to climate change, as it would not be possible to differentiate the influence of natural variability from that of climate change (e.g., Allen 2003). What Stott and colleagues suggested was a model-based approach that relies on a comparison between the probability of an event E happening in the factual world (p1) and the probability of the event happening in a counterfactual world (p0) with anthropogenic climate change removed.

That is to say, within the standard approach, the aim is to establish the relative difference in the probability of some type of event, under which E sorts, occurring under different conditions (with and without climate change). For instance, within the approach, one computes estimated probabilities and related diagnostics such as the Fraction of Attributable Risk (FAR = 1 – p0/p1) and the Risk Ratio (RR = p1/p0). Such comparisons of probability distributions are then subjected to statistical analysis to determine whether the event in question should be considered a consequence of climate change. It should be noted here that the move from deterministic to probabilistic causality, although preceded by similar moves in, for instance, epidemiology, was not uncontroversial among climate scientists. The difficulty of separating uncertainty from less than deterministic causal influence is also well known from other fields. Further, the standard approach has its limitations and challenges. One of these is the consequence that a sufficiently small p0 results in FAR values close to 1 irrespective of the efficacy of the causal factor studied. There are thus things to be said in favor of RR, which also has its challenges, including reasoning from types to tokens, a ubiquitous problem for probabilistic accounts of causation (see, e.g., Sober (1984) and Lusk (2017)), but that is how the standard approach can be used for single event attribution.
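To make the two diagnostics concrete, the following minimal Python sketch computes FAR and RR from purely illustrative probabilities (not values from any actual study); it also illustrates the point just made, namely that FAR approaches 1 whenever p0 is much smaller than p1, even if the event remains rare in absolute terms in both the factual and the counterfactual world.

    # Minimal illustration of the two diagnostics defined above.
    # All probabilities are made up for the sake of the example.
    def far(p0: float, p1: float) -> float:
        """Fraction of Attributable Risk: FAR = 1 - p0/p1."""
        return 1.0 - p0 / p1

    def rr(p0: float, p1: float) -> float:
        """Risk Ratio: RR = p1/p0."""
        return p1 / p0

    # A tenfold increase in probability at a moderate baseline risk ...
    print(far(0.02, 0.2), rr(0.02, 0.2))      # FAR = 0.9, RR = 10.0
    # ... and a hundredfold increase at a tiny baseline risk: FAR is close to 1
    # even though the event remains very unlikely in the factual world.
    print(far(1e-6, 1e-4), rr(1e-6, 1e-4))    # FAR = 0.99, RR = 100.0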

Essentially, as far as principles about how to extract causal knowledge are concerned, counterfactual reasoning is not a new idea. Counterfactual accounts of causation have been part of our causal understanding since the eighteenth century. There are at least three possible continuations of the basic idea, mirroring the old idea that causation has to do with relations of necessity and/or sufficiency between cause and effect (Hume 1748). How this conceptual machinery can be applied to single event attribution problems is nicely laid out in Hannart et al. (2016). However, it should be noted that the discussion of the nature of causation that we see in this literature is rather narrow. Determinism/indeterminism is, of course, a fundamental ontological distinction, and whether causes are sufficient or necessary for their effects is also an interesting issue, but the discussion of what kind of causation one needs for mitigation, adaptation, and loss and damage issues will probably not be limited to these basic types.

Shepherd (2016) outlines three main steps in the standard approach:

  1. Event definition

  2. Reconstruction of factual probability distribution p1

  3. Construction of counterfactual probability distribution p0

It should be noted that, although observational data often provide p1, typically both p1 and p0 are obtained by running (ensembles of) dynamic climate models—that is to say models that represent (among other things) the dynamic circulation in the atmosphere. p0 can also be estimated using observations by regressing the location parameter in a statistical model against a proxy for climate change (such as global mean temperature increase).
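As a schematic illustration of steps 2 and 3, and of how estimates of p1 and p0 feed into the diagnostics defined earlier, the following Python sketch fits extreme value distributions to two synthetic ensembles. The numbers, the Gumbel-type setup, and the threshold are assumptions made purely for illustration; they are not the output of any real model ensemble or published study.

    # Hedged sketch: exceedance probabilities from synthetic "factual" and
    # "counterfactual" ensembles of annual maxima, fitted with a GEV model.
    import numpy as np
    from scipy.stats import genextreme

    rng = np.random.default_rng(0)

    # Hypothetical annual-maximum anomalies (degC) from two model ensembles.
    factual = rng.gumbel(loc=1.2, scale=0.8, size=500)          # with climate change
    counterfactual = rng.gumbel(loc=0.0, scale=0.8, size=500)   # climate change removed

    threshold = 3.0  # event definition: at least as extreme as the observed event

    # Steps 2 and 3: reconstruct p1 and construct p0 from the fitted distributions.
    params1 = genextreme.fit(factual)
    params0 = genextreme.fit(counterfactual)
    p1 = genextreme.sf(threshold, *params1)   # P(anomaly > threshold | factual)
    p0 = genextreme.sf(threshold, *params0)   # P(anomaly > threshold | counterfactual)

    print(f"p1 = {p1:.3f}, p0 = {p0:.3f}")
    print(f"FAR = {1 - p0 / p1:.2f}, RR = {p1 / p0:.1f}")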

Each of these steps comes with a range of methodological and philosophical challenges and trade-offs that scientists need to deal with. For example, researchers need to balance the ability of the models to represent the phenomenon in question, which typically requires high-resolution models, against the need to run models over long periods of time, which requires computationally efficient (mostly low-resolution) models. There are also issues with identifying and representing extreme events of the right type due to, e.g., constraints induced by model and data resolution. In general, models perform significantly better at representing extreme temperatures than extreme precipitation (IPCC AR5 WGI, Ch. 9). Moreover, removing climate change from the models in order to obtain a counterfactual baseline scenario is no easy task either, given the convention of not using coupled models but instead separating the atmospheric component from ocean thermodynamics (Shepherd 2016).

4.2 The standard approach’s performance in relation to type I, type II, and type III errors

The main concerns that have been raised with respect to the standard approach revolve around a handful of issues. First, there are worries about the reliability of the dynamical models used to determine relative frequencies. A second, more general issue has to do with the use of frequentist statistics and conventional null hypotheses that assume no causal connection between climate change and the event in question (Lloyd and Oreskes 2019; Shepherd 2016; Trenberth 2011). A third and related issue has to do with the definition of an event (or a class of events). The need for a sufficiently large sample of events for probabilistic event attribution may lead to diluting or blurring the actual event at hand (van Oldenborgh et al. 2021).

The standard approach tries to establish causal links between climate change and individual weather events by statistical analysis of model runs. If the question that is the locus of the study concerns whether climate change was a cause of event E, the null hypothesis is that climate change did not cause the extreme event in question. Accepting or rejecting the null hypothesis is done on the basis of some degree of statistical significance; usually a significance threshold of 0.05 is used. The p-value is the probability of obtaining the observed value, or one that is more extreme, if the null hypothesis were correct. If the p-value is less than the specified level, the result is considered significant, and the null hypothesis is rejected.
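Purely to illustrate the logic of such a test, the following Python sketch compares exceedance counts in a factual and a counterfactual ensemble with a one-sided permutation test. This is a schematic example with made-up counts, not the statistical protocol of any particular attribution study.

    # Schematic one-sided permutation test with hypothetical exceedance counts.
    # Null hypothesis: the factual/counterfactual labels make no difference.
    import numpy as np

    rng = np.random.default_rng(1)
    n_runs = 200                # runs per ensemble (hypothetical)
    factual_exceed = 19         # factual runs exceeding the event threshold
    counterfactual_exceed = 7   # counterfactual runs exceeding it

    observed_diff = (factual_exceed - counterfactual_exceed) / n_runs

    # Pool all outcomes (1 = exceedance, 0 = no exceedance) and relabel at random.
    pooled = np.zeros(2 * n_runs)
    pooled[: factual_exceed + counterfactual_exceed] = 1.0

    diffs = np.empty(10_000)
    for i in range(diffs.size):
        rng.shuffle(pooled)
        diffs[i] = pooled[:n_runs].mean() - pooled[n_runs:].mean()

    # p-value: probability of a difference at least this large, were the null true.
    p_value = np.mean(diffs >= observed_diff)
    print(f"one-sided p-value = {p_value:.3f}")   # reject the null if below 0.05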

Given these conventions, the onus is on the scientist to produce evidence that satisfies exacting standards in order to reject the null hypothesis. But, argue those hesitant about the standard approach, we already know that climate change is in one way or another a factor in every weather event. The alleged consequence of starting from this null hypothesis, combined with evidentiary standards that are difficult to meet due to the paucity of the data, is that “[a]s a whole the community is making too many Type II errors” (Trenberth 2011), i.e., the community fails to accept attribution hypotheses which are in fact true because the false null hypotheses are not rejected.

However, the choice of null hypothesis is not forced upon the advocate of the standard approach by logic, and it is not essential to hypothesis testing to assume as its null hypothesis that there is no causal connection between X and Y (nor that there is no causal connection between not-X and not-Y). It might be the case that the standard approach always selects as its null hypothesis that there is no causal connection between climate change and the hazard, but one only has to consider biological hypothesis testing to find more variation. Often the null hypothesis that there is no causal connection is chosen; sometimes the currently accepted view is relied on as the null hypothesis, and sometimes the hypothesis that facilitates computation might be selected.

Moreover, parts of the standard approach would survive the adoption of another statistical framework. A recent suggestion in the detection and attribution literature is to deploy a Bayesian framework for analyzing data rather than the conventional frequentist one. Mann and colleagues (Mann et al. 2017) have argued for such a shift in statistical paradigm and demonstrated how it can potentially improve the detection of a signal in the data by speeding that process up.
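The computational details of Mann et al.’s analysis are beyond our scope, but the flavor of the shift can be conveyed by a minimal Beta-Binomial sketch in Python, using the same hypothetical exceedance counts as above. This is a generic illustration, not Mann et al.’s actual method: instead of rejecting or failing to reject a null hypothesis, one reports a posterior distribution for the risk ratio.

    # Hedged, generic Bayesian sketch (not any published study's analysis):
    # Beta posteriors for p1 and p0 from hypothetical exceedance counts,
    # and the implied posterior for the risk ratio RR = p1/p0.
    import numpy as np

    rng = np.random.default_rng(2)

    n1, k1 = 200, 19   # factual runs, exceedances (hypothetical)
    n0, k0 = 200, 7    # counterfactual runs, exceedances (hypothetical)

    # Uniform Beta(1, 1) priors; Beta posteriors follow by conjugacy.
    p1_post = rng.beta(1 + k1, 1 + n1 - k1, size=100_000)
    p0_post = rng.beta(1 + k0, 1 + n0 - k0, size=100_000)
    rr_post = p1_post / p0_post

    print("posterior median RR:", round(float(np.median(rr_post)), 2))
    print("P(RR > 1 | data):", float(np.mean(rr_post > 1)))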

There is an obvious risk of type III errors when decision-makers try to apply results from the standard approach to real-world problems. The approach is ideally suited to produce information at the type, or kind, level, not information about single events (Lusk 2017): Have events of kind K become more or less frequent as a result of climate change? Winsberg et al. (2020, 143) argue that its primary research question is “what is the probability of a specific class of weather event, given our world with global climate change, relative to a world without global climate change?”. The problem just described is one of many similar examples of how information expressed in statistical terminology (type/token, p-values, the implications of non-rejection of null hypotheses, etc.) often leads to misunderstandings about precisely which questions this information can be used to answer. In other words, it is a source of potential type III concerns.

4.3 The storyline approach

The main alternative to the standard approach is called the storyline approach and is of slightly more recent origin (Shepherd 2016; Trenberth 2011). In some of its early formulations, the qualitative aspects of the storyline approach were highlighted (see, e.g., Trenberth et al. (2015)). However, it is clear from other examples that the storyline approach is quantitative as well—just not probabilistic (see, e.g., van Garderen et al. (2021)). Instead of focusing on the background conditions that determine the probability of some kind of event happening, the storyline approach takes the event in question as given. Another way to put this difference is to say that the standard approach starts causally upstream of the event in focus—it considers the case for changes in the climate having resulted in the event in question by predicting it in possible worlds with and without climate change—whereas the storyline approach takes as its starting point a concrete event that we know to have happened and tries to lay out the thermodynamical causal process or chain of which it is part. Certain dynamical aspects of the model are constrained. Since we understand the thermodynamic aspects of climate change and their relation to extreme events better than the dynamic aspects, such as large-scale circulation patterns, this improves the signal-to-noise ratio of anthropogenic influence but makes it impossible to fully estimate the change in likelihood of the event (National Academies of Sciences, Engineering, and Medicine 2016, p. 39). More generally, storylines are physically self-consistent unfoldings of past events or of plausible future events or pathways (see also Shepherd et al. (2018) and Lloyd and Oreskes (2019)).

In relation to attribution problems, the storyline approach is typically backward-looking, trying to connect an extreme event with causally upstream climate change. According to this approach, allegedly suitable when the standard approach yields intolerable uncertainties, what is called for is “a physical investigation of how the event unfolded, and how the different contributing factors might have been affected by known thermodynamic aspects of climate change” (Shepherd 2016). The approach has been likened to an autopsy (Lloyd and Oreskes 2019), where the aim is to determine the “best estimate of the contribution of climate change to the observed event” (Shepherd 2016) rather than refuting or accepting the null hypothesis that climate change was not involved.

In order to judge its merits, one must probe the specific features it is assumed to draw on in climate change attribution. Specifically, what causal traces could we hope to find in the effects of climate change impacts that will not be found elsewhere? The storyline approach does not rely on comparative modeling runs using dynamic circulation models, which are associated with considerable uncertainties, but puts emphasis on well-established knowledge such as the relationship described by the Clausius–Clapeyron equation. According to this equation, the amount of water that the atmosphere can hold as vapor goes up exponentially with increasing temperature, at a rate of about 7% per degree C. Loading more water into the atmosphere is an important meteorological factor, not least when it comes to precipitation. Such thermodynamic factors are central to the storyline approach when it attempts to answer questions about the way in which climate change altered the impacts of, for instance, a hurricane.
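For a rough sense of the magnitudes involved, the approximately 7%-per-degree relationship can be turned into a back-of-the-envelope calculation in Python. The warming figure used below is illustrative, and the 7% rate is the usual rule of thumb rather than an exact constant.

    # Back-of-the-envelope Clausius-Clapeyron scaling: roughly 7% more
    # saturation vapor pressure per degree C of warming (rule of thumb).
    def cc_moisture_factor(delta_t_c: float, rate_per_degree: float = 0.07) -> float:
        """Multiplicative increase in the atmosphere's water-holding capacity
        for a warming of delta_t_c degrees Celsius."""
        return (1.0 + rate_per_degree) ** delta_t_c

    # E.g., an illustrative warming of 1.2 degC:
    print(f"{(cc_moisture_factor(1.2) - 1) * 100:.1f}% more moisture capacity")
    # -> roughly 8-9% more water vapor available to fuel extreme precipitation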

4.4 The storyline approach’s performance in relation to type I, type II, and type III errors

Because it assumes that there is some kind of causal connection between human-induced climate change and the extreme weather event one is examining, the storyline approach has been claimed to risk leading to the equivalent of too many type I errors:

By always finding a role for human-induced effects, attribution assessments that only consider thermodynamics could overstate the role of anthropogenic climate change, when its role may be small in comparison with that of natural variability, and do not say anything about how the risk of such events has changed. (Stott 2016, 33)

We say “equivalent of a type I error” here because the storyline approach typically neither formulates the null hypothesis that the event in question is not caused by climate change nor rejects it. However, it does not seem clear that the approach must overstate the effect of climate change—it could just as well underestimate it. Furthermore, storyline approaches can be rather specific about the roles of climate change and natural variability, respectively (see, e.g., van Garderen et al. (2021)).

It is a commonplace to contrast the two approaches in the following way:

  • The standard approach produces mainly type II errors

  • The storyline approach produces mainly type I errors

But this contrast cannot be relied on as more than a heuristic. Importantly, the fact that the storyline approach does not really operate with the same hypotheses as the standard approach (and, of course, that strictly speaking the standard approach does not answer attribution questions at the token level) makes the contrast slightly misleading. The probabilistic approach takes a coarse-grained view; the storyline approach takes a more fine-grained view, at the cost of not addressing dynamical aspects of change.

Without in any way downplaying the importance of type I and type II errors, differences in the ways the storyline and the standard approach formulate and solve problems motivate probing potential type III errors. What conclusions does the storyline approach deliver? It delivers answers to questions about whether and to what extent known thermodynamical aspects of climate change affect properties or features of the event (properties or features that make it more extreme). It cannot do more, since it takes the dynamical aspects of the situation as given, i.e., as arising by chance (Trenberth et al. 2015), despite the fact that they too contribute to the extreme event and are somehow linked to climate change. Given this, a type III error is easily committed by those who want an answer to the question of whether climate change caused the event in question. This is because “cause” means many things, and often something more complex than the storyline (and the standard) approach assumes. The storyline approach has difficulty with causes being sufficient for their effects, for instance. Furthermore, causation is of course not only about thermodynamics. As long as we are interested in causal relations between events and understand these as particulars with more properties than one, causing entails affecting but not vice versa (see, e.g., Mellor (1995)). Therefore, depending on the causal discourse, attribution problems might need to be formulated in specific ways for the storyline approach to be able to shed light on them. Roughly, instead of being formulated in terms of being caused (or made more probable), they need to be framed in terms of being made more extreme (or similar). This might have consequences for the usability of climate change information derived from the storyline approach in decision-making contexts. A perhaps surprising positive example, discussed in Sillmann et al. (2021), is the storyline approach’s potential advantage of providing plausibility rather than probability for actionable climate risk scenarios regarding high-impact events. But there are other circumstances where the choice of approach matters. It matters because the type III risks differ. Sometimes the risks are greater if one adopts the standard approach; sometimes they are greater if one adopts the storyline approach. Having concluded this, it must be noted that the standard approach and the storyline approach can come to highly complementary conclusions and that they are versions on a continuum rather than completely different in nature (see, e.g., van Garderen et al. (2021)).

5 Topic incommensurability and the context of application

Winsberg et al. (2020) are clearly right on one point. To the extent that the two approaches were originally designed to handle different problems and have different research agendas, they are in a certain sense not competing accounts. Hacking (1983) would have referred to them as “topic incommensurable.” Topic incommensurability is entailed by having different research agendas.

However, the importance and extent of this incommensurability seem to depend on whether the approaches intersect in some of their applications. The line of demarcation between the scientific approach on the one hand and its implementation (i.e., single event attribution problem-solving) on the other therefore becomes crucial in settling the issue. If the actual attribution takes place outside the approach (after the solution is fed), we can safely claim that the two approaches can be complementary—shedding light on different aspects of a family of problems—without competing in more than an instrumental sense. But not if it takes place inside the approach, as would be the case if there is no solution-feeding until the attribution conclusion is reached—which would be the default mode if, in reality, the approaches are not primarily scientific research agendas but are, for instance, designed for legislative or policy-making purposes. This would matter greatly for the potential of the approach to play a role in the attribution-for-science sense discussed above.

In any case, it seems that, minimally, via solution-feeding to the single event attribution context, components of the approaches have been imported to facilitate solutions to the same problem. Thus, in their new context, they are not clearly topic incommensurable. Indeed, the approaches do on occasion provide answers that have been perceived to be in tension with one another.

Whether the two approaches are complementary or competing accounts depends on the surrounding circumstances, and the import of various sources of uncertainty may make one choice rather than another suitable, but mainly it depends on the kinds of questions a presumptive decision-maker is interested in pursuing.

6 Progress and inductive risk

As Eric Winsberg and colleagues (2020) point out, an examination of the methodological debate that has been going on within EDA science will soon reveal patterns that are familiar to the philosopher of science. Concerns with so-called inductive risks are frequently raised as important when one method is to be preferred over another.

The argument from inductive risk was first formulated by Rudner (1953) and Hempel (1965), and it is supposed to illustrate how non-epistemic values are involved in making epistemic judgments under uncertainty. In brief, the argument is structured as follows. No (interesting) hypothesis is ever verified with certainty, so a decision to accept or reject a hypothesis depends upon whether the evidence is sufficiently strong. But whether the evidence is sufficiently strong depends upon the consequences (including ethical consequences) of making a mistake in accepting or rejecting the hypothesis.

Thus, all else being equal, if the consequences are dire, then stricter standards should be deployed than if the consequences are mild. When we are assessing, say, the presence of some toxic substance in food for toddlers, there is reason to adopt standards that drastically curtail the tendency toward false negatives (type II errors), i.e., toward failing to detect the substance when it is actually there, even if this comes at the expense of more false positives (type I errors). In other circumstances, where the risks associated with false negatives are very low, or the overall certainty is very high, other standards can be adopted. The point is that these considerations, such as what should be considered a severe or unacceptable risk, are considerations of an ethical or perhaps social nature. But, as the argument seems to indicate, they are involved in the most epistemic of activities: accepting and rejecting hypotheses.

That reference to the consequences of making mistakes plays an important role in today’s DA debate is obvious:

“An individual might miss out on a high-profile paper, but that would be a small price compared to the reputational harm of claiming a positive result that subsequently turns out to be false.” (Allen 2011, p. 931)

“… given the observed slow response of civil society to act to prevent those harms, it seems reasonable to conclude that the risks of underreaction to climate change are now greater than [sic] the risks of overreaction. We suggest that in such a situation, it is ethically preferable to embrace an approach that avoids understating what we know.” (Mann et al. 2017, 140f)

According to Rudner’s argument, the relevant concerns that influence the kinds of methodological choices facing scientists in the field of EDA science are (and should be) informed by the broader societal risks run by making decisions on the basis of erroneous claims. Accordingly, overestimating the impacts and consequences of climate change leads to misguided adaptation and mitigation efforts and possibly other risks as well (Stott et al. 2016). Without any further qualification, this is clearly undesirable. Underestimating the impacts and consequences of climate change, on the other hand, potentially leads to inadequate mitigation and adaptation efforts. That in turn may have downstream effects that are nothing short of catastrophic. Framed in this way, weighing these kinds of risks against one another may appear straightforward. The risk of sub-optimally distributing resources is balanced against severely curtailing the carrying capacity of the planet and irreversible ecological destruction.

It thus seems perfectly clear that, descriptively, there is no such thing as a value-free science, at least not in the case of DA. But the important question in this context is normative. Our brief illustration of what attribution “is for” clearly shows that, for normative reasons, we would benefit from finding a way of taking the different needs in the four dimensions we identified (and possibly others that we have not thought of) into account without compromising the future use of DA in any of the four dimensions.

What we would ideally like to have in place here, as in many other cases where problem-feeding is involved, is a modified version of Jeffrey’s (1956) proposed solution to Rudner’s challenge. Jeffrey’s strategy is to challenge one of Rudner’s premises, namely that it is the job of the scientist qua scientist to accept and reject hypotheses. Rather, Jeffrey suggests that the role of the scientist is merely to report on the probability (given the evidence) of various hypotheses being true. This would leave inductive risk considerations to the decision-maker, and science could live up to its value-free ideal. Unfortunately, we know that this is impossible. There are several reasons for this (see, e.g., Parker and Winsberg (2018)). One rather fundamental reason is that scientists need to make a number of decisions themselves in order to facilitate scientific progress and avoid a kind of Duhemian cul-de-sac. As Popper points out, the foundation upon which scientific knowledge is erected is not absolutely firm. He writes:

Science does not rest upon solid bedrock. The bold structure of its theories rises, as it were, above a swamp. It is like a building erected on piles. The piles are driven down from above into the swamp, but not down to any natural or ‘given’ base; and if we stop driving the piles deeper, it is not because we have reached firm ground. We simply stop when we are satisfied that the piles are firm enough to carry the structure, at least for the time being. (Popper 1959, 94)

Since there can never be absolute certainty about anything and science must progress, inductive risk considerations get built into the very foundations of science. This is of course not to say that all value influences on science are permissible. Two issues arise when value influences are to be assessed. First, they must be of the right kind (see, e.g., Douglas (2009)). Then, given that they are of the right kind, they also have to be the right values, or somehow possible to adjust toward the right values. What are the “right” values? They could be those that are morally defensible, for example, the values upheld by some relevant group or individual (see, e.g., Intemann (2015)) or by those who are exposed to whatever the relevant risks are (see, e.g., Shepherd (2019) and Parker and Lusk (2019)).

7 Problem-feeding pluralism

The pluralism that Winsberg and colleagues see has two faces. One concerns questions and answers. The two approaches can be disambiguated in accordance with what Lloyd (2015) calls the logic of research questions. As noted above, the two approaches are thus topic incommensurable and do not come into conflict with one another. The other face of this pluralism has to do with different ideas about the prospects of a value free science. Roughly speaking, proponents of the standard approach align themselves with Jeffrey, whereas the defenders of the storyline approach are Rudnerians. In other words, the real disagreement is about the role of values in science and how value influences are best managed.

Hence, Winsberg et al. distinguish between first-order inductive risk and second-order inductive risk. First-order inductive risk revolves around familiar concerns regarding the relative weight of different kinds of errors. What is worse, underplaying the consequences of climate change or overplaying them? Second-order inductive risk has to do with the underlying philosophical issue: how should value influences on science be construed to begin with? And what are the implications for scientific practice in fields with considerable practical and political relevance?

Above, we have structured this situation as a problem-feeding situation. That means that we differentiate between the context in which the problem arises (or the question is asked) and the context in which that problem is solved. In short, an important consideration in EDA, and a major motivation for engaging in this kind of research, is (minimally) the perception that it is socially relevant in various ways (see our list above). That is to say, there is a “social” problem of some sort that scientists think themselves able to solve or at least contribute to the solution of. If this perception is accurate, a crucial feature of the problem situation has to do with the context in which the problem arises and whatever the demands and requirements are in that context. This crucially includes what counts as an admissible solution to the problem at hand in the setting in which the problem arose.

What counts as a solution to a problem is, in most scientific disciplines as well as outside of science, to some degree a matter of convention (Duhem 1954; Kuhn 1993; Nickles 1981). Many different considerations determine what amounts to having solved a problem in a particular context, and the diversity is considerable; what is considered an admissible solution to a given problem in mathematical biology is different from, say, what solutions look like in archeology or sustainability science. A complex set of values, of both epistemic and non-epistemic kinds, presumably guides these considerations. Such considerations would appear to precede conventional inductive risk considerations, which mainly have to do with where to place the threshold that governs whether hypotheses are accepted or rejected. If the solution fails to gain any traction in the context of application, that particular issue does not even arise. Whether conventional inductive risks come into play at all depends on the overall admissibility of the solution; once a solution is deemed to be of the correct type, its quality can be assessed. This indicates that type III errors are more fundamental than type I and type II errors; type III errors precede the other types of error (at least narrowly construed), although very similar value considerations arise.

With this shift, the focus falls on the context of application and the demands and requirements that govern it, rather than on the intentions of problem solvers and the logic that governs how research questions are formulated in the problem-solving context.

8 Concluding remarks

We have shown how problem-feeding and solution-feeding challenges may arise in single event attribution studies (particularly in the context of application). Recent discussion of the issues that surround EDA science has revolved around the relative importance of avoiding type I and type II errors, respectively—or, more concretely, around the risks involved in overstating and understating the consequences and costs of climate change. Instead, we have highlighted so-called type III errors, which arise when single event attribution is deployed in various practical contexts, such as adaptation planning and management or legal liability. Broadly speaking, these depend on a gap between what the methodologies provide in strict terms and the solutions required. That gap must be closed in one way or another. Leaving it open creates opportunities for problem–solution misalignment.

The standard approach, most notably, is afflicted by type–token concerns. It produces probabilistic answers at the type level and not, strictly speaking, token-level attributions. The latter would have to be inferred from the former, which is at best tenuous. The storyline approach, on the other hand, gives rise to other kinds of type III concerns. Most notably, it does not straightforwardly answer the question “Was extreme event E caused by climate change?” or any of its close relatives. Indeed, it could not and would not try to answer it. It takes as its starting point that climate change has changed everything, to some degree. For all Es, there are aspects of E that would not have occurred without climate change. The only relevant question, according to it, is how climate change has changed things. It provides information about how aspects of E were affected. It does this in a conditional way, on the basis of suitable causal assumptions. By adopting a retrospective approach, it runs into other type III problems as well. It becomes difficult to put to forward-looking, risk-based use.

That such gaps exist is no reason to discard the methodologies wholesale; gaps of this kind are commonplace, and it is a rare thing for scientific solutions to couple hermetically and without residue to non-scientific problems. It does, however, raise practical concerns with respect to how the gap should be closed and which actor or actors are appropriately positioned to balance epistemic values and aims against ethical and social concerns.

Second, and more specifically, the potential complementarity of the two approaches further depends on whether and where this gap is closed relative to the context of application. If both methods are taken to respond to regular causal questions, then they can come into tension with one another, as they sometimes have.

However, whenever attribution science is invoked in, for instance, a legal liability case, a combination of the two approaches is usually necessary to build a strong enough case, by providing evidence for the different premises needed in the argument. For instance, a probabilistic approach may be used to consider alternative aspects of climate change and find out whether anthropogenic climate change is the main driver of the change in hazard. If so, the storyline approach will be important for quantifying impacts and harm (Lloyd & Shepherd 2021).