1 Introduction: Evaluative and Deontic Propositions

This chapter concerns arguments with normative conclusions, as illustrated by instances of argumentation that have become prevalent during the COVID-19 pandemic. In particular, we aim to explore the inferential connection between evaluative and deontic concepts,Footnote 1 more specifically, between ‘good’ and ‘ought’.

Note, to start, that contrary to Hume’s famous dictum, it does not seem difficult to derive an ought from an is:

  1. (1)
    1. a.

      Wearing a face mask in public is good.

    2. b.

      Ergo, you ought to wear a face mask in public.

This argument looks perfectly sound; and yet, there is no ‘ought’ in the premise. One would immediately respond, however, that ‘is good’ is an evaluative expression, and hence that this is not a counterexample to the claim that one cannot derive an ‘ought’ from an ‘is’, since evaluative and deontic expressions belong to the same domain of broadly normative expressions. But this is our point of departure: while evaluative and deontic expressions are often subsumed under the same broad class, there are important, often overlooked, differences between them.Footnote 2 Nevertheless, whether evaluatives and deontics also differ in their argumentative properties and roles, and if so, how, are questions that, to our knowledge, have never been properly explored.

1.1 Similarities and Differences Between Deontic and Evaluative Language and Concepts

There are three possible approaches to adopt regarding the relation between the deontic and the evaluative realm:

  1. A1:

    The differences between evaluatives and deontics that exist in other domains have no impact on argumentation. Evaluatives and deontics are, from an argumentative point of view, pretty much interchangeable.

  2. A2:

    While evaluatives and deontics exhibit different argumentative patterns, there are interesting logical connections between the two.

  3. A3:

    There is no special or interesting logical relationship between evaluatives and deontics.

A1 and A3 are two extreme positions; we present them here as mere conceptual possibilities. A1 makes two predictions: first, appropriately similar evaluative and deontic judgments should be overall interchangeable in argumentation. For example, every judgment about what is good should be substitutable for a judgment about what ought to be the case or what one ought to do, and vice versa, without affecting the overall validity of an argument. Secondly, it should be possible to derive ‘ought’ from ‘good’ and ‘good’ from ‘ought’. Insofar as Moore (1903) aimed to reduce right to good (by treating the former as denoting instrumental value), he may be seen as a defender of A1; similarly, Hare’s (1952) prescriptivism may be seen as an attempt to account for evaluative and deontic expressions (chiefly good, right and ought) as particular species of prescriptive language.Footnote 3 In stark contrast, A3 rejects the existence of any inferential or logical connection whatsoever between evaluatives and deontics. To our knowledge, A3 has never been defended or even endorsed, but we are including it as part of the conceptual space.

It is more common to find endorsement of (some version of) A2 in the literature. A2 is a weaker thesis than A1, since it is not committed to the first prediction above: if evaluatives and deontics exhibit different argumentative patterns, we should not expect these expressions to be interchangeable salva validitate. Moreover, A2 does not imply that one can derive ought from good or the other way around. However, A2 is compatible with the existence of some inferential connections between deontic and evaluative expressions (in either direction), contrary to A3.

Indeed, it is intuitive to think that something akin to A2—or perhaps A1—lurks beneath the usual reaction against the purported counterexample to the is-ought gap above. ‘Good’ and ‘ought’ belong to the same broad class of normative expressions, so the impression that in (1) one has jumped from the descriptive to the normative realm is an illusion—the premise was already normative. In the remainder of this section, we consider the motivation for A2 by considering similarities and differences between evaluative and deontic expressions and concepts.

Consider first the similarities between the two families of concepts. Broadly speaking, there are reasons to think that deontics and evaluatives belong to the same conceptual realm. This is supported by the observation that both classes of expressions seem to admit the same contextual qualifications and parameters. Specifically, they show a clear dependence on different “flavors” of evaluation, they are circumstance-dependent in broadly the same ways, and they are action-guiding. Let us review these characteristics in turn.

First, we speak of categorical and hypothetical value; similarly, we can think of categorical and hypothetical obligations or duties.Footnote 4 Categorical values and obligations concern invariable and universal ends such as attaining well-being; hypothetical (or instrumental) values and obligations are those that serve as means to particular, variable ends. Linguistically, judgments of categorical value/obligation are most often expressed with unqualified uses of the relevant evaluative/deontic expressions, while judgments of hypothetical value/obligation introduce qualifications such as ‘in order to φ’. To see that introducing this qualification can affect the truth conditions of a value or deontic judgment, note the contrast between (2) and (3):

  1. (2)
    1. a.

      ✔ You ought to visit your family.

    2. b.

      ✔ Visiting your family is good.

  2. (3)
    1. a.

      ╳ You ought to visit your family in order to prevent the spread of COVID-19.

    2. b.

      ╳ Visiting your family is good in order to prevent the spread of COVID-19.

The sentences in (2) are intuitively true, but when they are modified by ‘in order to…’, as in (3), they can become false. For we may agree that families involve (more or less) categorical values and obligations, but these can be overridden by a hypothetical goal, such as that of preventing the spread of COVID-19.

Secondly, values and obligations are circumstance-dependent: wearing a mask might be a bad idea for people with certain respiratory conditions; and whether you ought to stay at home, for instance, depends on whether you are safe there.

Finally, values and obligations are action-guiding: ascribing value to something invites the thought that one is motivated to orient one’s actions towards it; the same goes for judging that one ought to do something.Footnote 5 Relatedly, influential accounts of normativity in metaethics, such as buck-passing (Rowland, 2019; Scanlon, 1998) or expressivist accounts (Ayer, 1946; Gibbard, 1990, 2003; Stevenson, 1937), have been applied to both evaluative and deontic expressions. Often, authors in these metaethical traditions speak of deontic and evaluative normativity without distinguishing them. These considerations invite the conclusion that, at an important level of description, the deontic and the evaluative belong to the same realm. It is thus natural to expect that, when deontic and evaluative concepts appear in argumentation, one can in some instances move from one class of concepts to the other, and vice versa.

However, there are important differences between the two classes of concepts as well. Some of these differences were highlighted by classical ethicists. Hurka (2014) discusses Sidgwick, Moore and Ross as authors who all thought that there were interesting inferential connections between good and ought, even though there are important points of contrast between the evaluative and the deontic domains. For example, in The Methods of Ethics (1874/2019) Sidgwick calls ‘right’ an imperative concept, while ‘good’ is merely attractive. Another important difference between ‘ought’ and ‘good’, highlighted by Sidgwick, Ross and Prichard, is the well-known ought-implies-can principle, stemming from Kant: if someone ought to do something, then they can do it. But from the fact that an action or state of affairs is good, it does not follow that it can be brought about. Some other authors, like Russell (1910/2009), took ‘good’ to be the widest normative term, while ‘ought’ applies only to voluntary acts. Relatedly, ‘good’, at least on its surface, denotes a simple property of objects.Footnote 6 ‘Ought’, on the other hand, denotes a relation between an agent and whatever they are obliged to do: obligations are imposed upon specific individuals. Moreover, simple evaluatives such as ‘good’ and ‘bad’ take a wide range of arguments: individuals, objects, actions or states of affairs can be ‘good’.Footnote 7

  1. (4)
    1. a.

      Jacinda Ardern is a good politician.

    2. b.

      These masks are bad.

    3. c.

      Washing your hands is good.

    4. d.

      It’s good that the WHO issued new policies promoting physical exercise.

By contrast, ‘ought’ can be applied only to actions and states of affairs.

Finally, another point of contrast between evaluative and deontic expressions concerns their gradability: in general, evaluative terms are gradable, while deontic terms are not (Stojanovic, 2017; Tappolet, 2013). In other words, values come in degrees, while obligations are ‘on/off’; either they are in place or not at all. The gradability of evaluative expressions is attested by the admission of adjectival modifiers, such as ‘very’, ‘slightly’ or the comparative form. Evaluatives admit them clearly, while deontics sound odd with them (marked with #).

  1. (5)
    1. a.

      Wearing a mask in public is {very / slightly} good.

    2. b.

      Wearing a mask in public is better than washing your hands.

  2. (6)
    1. a.

      # Wearing a mask in public is {very / slightly} forbidden.

    2. b.

      # Wearing a mask indoors is more forbidden than wearing a mask outdoors.

This implies that to call something ‘good’ is not simply to ascribe to it a discrete property that objects either have or lack (such as, e.g., being dead or being married); it involves ascribing a scalar property to a certain degree, and thus implies a tacit reference to a contextually determined threshold or standard (see Cresswell, 1976; Kennedy, 2007; Kennedy & McNally, 2005 for classic references). By contrast, to ascribe an obligation or a prohibition is to attribute a property that something either has or lacks; there are no degrees of prohibition [obligation, permission], something is either forbidden [obligatory, permitted] or not.Footnote 8

These observations allow for the possibility that there are interesting logical, inferential or conceptual connections between deontic and evaluative concepts, even if they are ultimately different concepts.

1.2 The Inferential Connection Between ‘Good’ and ‘Ought’

For the purpose of the present chapter, we will limit our attention to the relationship between ‘good’ and ‘ought’, and, more specifically, between propositions of the schema ‘φ-ing is good’ and ‘you ought to φ’ (where φ stands for an action-type or a policy). The three approaches above (A1-A3) can be put into correspondence with the following hypotheses about the argumentative relation between ‘good’ and ‘ought’:

  1. H1:

    ‘φ-ing is good’ and ‘you ought to φ’ are logically equivalent and interchangeable in argumentation.

  2. H2:

    ‘φ-ing is good’ and ‘you ought to φ’ are logically asymmetric; either ‘φ-ing is good’ implies ‘you ought to φ’ (but not the other way round) or ‘you ought to φ’ implies ‘φ-ing is good’ (but not the other way round).

  3. H3:

    ‘φ-ing is good’ and ‘you ought to φ’ are logically independent.

Again, H1 and H3 have to our knowledge not been explicitly defended, but we are including them for reference. Before we move on, we should point out that H2 is more specific than A2; H2 concerns the relation of logical implication between ‘good’ and ‘ought’, whereas A2 simply acknowledged the existence of logical connections between the evaluative and the deontic domain. These might be different from the relation of logical implication described in H2, and they could involve other evaluative and deontic concepts (‘bad’ and ‘forbidden’, for example).
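The three hypotheses can be stated compactly in entailment notation. This is only a schematic restatement: the labels G_φ and O_φ are our shorthand for ‘φ-ing is good’ and ‘you ought to φ’, and ⊨ stands for whatever notion of logical implication H1–H3 are taken to involve.

```latex
% G_phi: 'phi-ing is good';  O_phi: 'you ought to phi'  (shorthand introduced here)
\begin{align*}
\text{H1:}\quad & G_\varphi \models O_\varphi \ \text{ and } \ O_\varphi \models G_\varphi \\
\text{H2:}\quad & \text{exactly one of } G_\varphi \models O_\varphi,\ \ O_\varphi \models G_\varphi \ \text{ holds} \\
\text{H3:}\quad & G_\varphi \not\models O_\varphi \ \text{ and } \ O_\varphi \not\models G_\varphi
\end{align*}
```

Stated this way, the three hypotheses are mutually exclusive and jointly exhaust the four possible combinations of the two entailments.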

We should also mention that, even though, to our knowledge, the inferential connection between ‘good’ and ‘ought’ has not been explored, the philosophical and linguistic literature contains well-known definitions of ought in terms of scalar goodness, that is, in terms of betterness. For example, Lewis (1973, Chap. 5) holds that ‘ought φ’ is true iff φ is better than ~φ. Along similar lines, Sloman’s (1970) definition is that ‘ought φ’ is true iff φ is the best alternative (where the alternatives to φ are contextually determined; see also Wedgwood, 2017). More recently, Lassiter (2017, Chap. 8) takes issue with such definitions. His argument against Lewis’s and Sloman’s views is, roughly, that they make bad predictions for supererogatory acts. Supererogatory acts are acts that, while good, lie beyond the call of duty. For instance, suppose that your friend is ill and needs your help to clean her apartment. Given this, it’s true that you ought to help your friend. It would be even better, however, if you helped your friend and also baked her a cake. Yet baking a cake is not something that you ought to do; what you ought to do is help her. Baking is optional. Lassiter argues convincingly that both Sloman’s and Lewis’s definitions of ‘ought’ wrongly predict that, if helping your friend and baking the cake is better than helping and not baking, then helping and baking is also what you ought to do.

To prevent this, Lassiter settles for a weakened (that is, unidirectional) version of Sloman’s definition, which he calls Sloman’s Principle:

Sloman’s Principle: if ought φ, then φ is the best among its alternatives.

Thanks to its unidirectionality, this principle allows for the possibility of best alternatives that are not part of one’s duties, thus allowing for supererogation. Interestingly however, Lassiter does not discuss any connections between ought and good, only between ought and better.
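The contrast between the two biconditional definitions and the weakened principle can be displayed side by side. The notation is ours, not Lewis’s, Sloman’s or Lassiter’s: O(φ) abbreviates ‘ought φ’, ≻ is the betterness relation, and Best(φ) abbreviates ‘φ is better than every other member of its contextually determined set of alternatives’.

```latex
% Notation assumed for illustration only
\begin{align*}
\text{Lewis:}\quad & O(\varphi) \iff \varphi \succ \neg\varphi \\
\text{Sloman:}\quad & O(\varphi) \iff \mathrm{Best}(\varphi) \\
\text{Sloman's Principle:}\quad & O(\varphi) \implies \mathrm{Best}(\varphi)
\end{align*}
```

Only the last line leaves open that Best(φ) holds while O(φ) fails, which is the pattern that supererogatory acts such as helping and baking are meant to exhibit.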

The hypotheses above make predictions about the way in which sentences containing ‘good’ and ‘ought’ are expected to behave in argumentation. H1 predicts that speakers will have no problem going from premises of the form ‘φ-ing is good’ to ‘you ought to φ’ as well as in the reverse direction. Consequently, they are expected to accept any argument that takes them from ‘good’ to ‘ought’ or from ‘ought’ to ‘good’. H2 predicts that speakers will only accept one of those directions: either they will accept as valid arguments that take them from ‘φ-ing is good’ to ‘you ought to φ’ and reject arguments that take them from ‘you ought to φ’ to ‘φ-ing is good’, or the other way around. Finally, H3 predicts that speakers are expected to reject both directions throughout.

1.3 Assessing the Hypotheses Empirically

In order to test these hypotheses, and thereby the plausibility of the different approaches to the place of evaluatives and deontics within a broader theory of normativity, we have run an Inferential Judgment Experiment, where we instructed participants to decide whether an argument involving deontic and evaluative expressions made sense or not.

As experimental items, we have focused on examples concerning the current COVID-19 pandemic. This choice was motivated by the observation that the global sanitary crisis to which the world has been exposed has given rise to a tremendous amount of argumentation, at the public, private and scientific level. The need for new measures, such as imposing lockdowns, curfews, and introducing various other means of prevention, be they strict requirements or mere recommendations, has been accompanied by a need to justify such new policies, and such justifications, in turn, typically take the form of arguments that rely heavily on evaluative and deontic language. While such arguments can often be complex, at least from a linguistic point of view, we have opted for simplified versions. Our experimental study deployed as stimuli arguments such as the following:

  1. (7)
    1. a.

      You ought to wear a mask in public to prevent the spread of COVID-19. Therefore, wearing a mask in public is good to prevent the spread of COVID-19.

    2. b.

      Wearing a mask in public is good to prevent the spread of COVID-19. Therefore, you ought to wear a mask in public to prevent the spread of COVID-19.

Note that, when participants are asked to assess such arguments, they will typically have a personal opinion regarding their premises and conclusions. Who hasn’t wondered whether it is really such a good idea to wear a mask on the street, or while jogging outdoors? Who hasn’t asked themselves whether they ought to wear gloves when shopping? Who hasn’t reflected upon the policies and guidelines that were designed and implemented in the face of global, unprecedented uncertainty about what one ought to do? The COVID-19 pandemic, and the measures against it, are topics that have spurred the most practical concerns, deliberations and decisions over the last two years.

However, and precisely because of this, the participants in our study may easily let their first-order opinions about the actions and policies under consideration override their judgment about the logical or argumentative connection between ‘ought’ and ‘good’. For instance, participants may be inclined to assent to (7a) or (7b) simply because they assent to the idea that wearing masks in public is good and/or something that they ought to do. This is why, in addition to instances of argumentation involving specific policies and actions, we have included in our study stimuli about which the participants cannot be opinionated, because they lack sufficient information to form an opinion of their own (for a similar approach, see Schumann et al., 2021):

  1. (8)
    1. a.

      Following the safety procedure is good to prevent the spread of COVID-19. Therefore, you ought to follow the safety procedure to prevent the spread of COVID-19.

    2. b.

      You ought to follow the safety procedure to prevent the spread of COVID-19. Therefore, following the safety procedure is good to prevent the spread of COVID-19.

Even if examples such as (8a) and (8b) are still purportedly about the COVID-19 pandemic, they more closely resemble the kind of stimuli typically preferred in experiments of this sort, which aim to avoid eliciting any personal involvement from the participants. As we shall shortly see, having stimuli of both types in our study permits us to see how this personal interference impacts speakers’ judgments about the arguments involving ‘good’ and ‘ought’.

2 An Experimental Study of ‘Good’ and ‘Ought’ in Argumentation

In order to assess the logical relation between the deontic ‘ought’ and the evaluative ‘good’, we used an Inferential Judgment Task method, in which participants have to decide whether one can justifiably draw a certain conclusion from a given premise (see Hansen & Chemla, 2017 for a similar methodology). In this case, participants were asked to decide whether a series of arguments involving COVID-19 measures ‘make sense’ or not, and in some cases, to provide a short justification.Footnote 9 An example of a trial is given in Fig. 3.1.

Fig. 3.1 Illustration of a critical trial in the ought > good directionality condition

Participants’ responses are taken to reflect inference patterns. We assume that if a participant considers that a certain argument ‘makes sense’, it is because they can infer the conclusion from the premise. We further assume that if two arguments are not supported to the same extent, then they must involve different inferential processes, that is, they cannot be logically equivalent. We then test (a) whether participants support arguments involving ‘good’ and ‘ought’ (e.g., 7), and (b) whether they support (7a) and (7b) to the same extent.Footnote 10

2.1 Participants

Forty-three English-speaking participants were recruited online using Prolific and were paid £0.80 for their participation, which lasted approximately 5 min. Participants who did not answer correctly more than 2/3 of each type of control trial (see below) were excluded from the analysis. This led to the exclusion of 10 participants.

2.2 Design and Materials

Arguments were of the form [p. Therefore, q]. For simplicity, we represent these arguments as p > q, where p is the premise and q the conclusion. In critical trials, arguments involved the propositions ‘φ-ing is good’ and ‘you ought to φ’. We manipulated which of these propositions was the premise (p) and which one was the conclusion (q) (henceforth, the Directionality factor). There were two possible Directionalities: good > ought or ought > good (henceforth G > O and O > G).

Arguments also varied in their content (φ). There were 12 different contents, including both specific actions and non-specific actions (see Table 3.1). Specific actions were action-types that participants could be antecedently opinionated about (e.g., washing your hands), whereas non-specific actions were policies or guidelines whose content was unknown to participants (e.g., following the safety protocol), and thus could not be opinionated about. We thus obtained 24 directionality/content combinations. Each participant was presented with a subset of 12 critical trials such that there were always 6 trials per Directionality controlled for specificity of the action, but without repeated content. All propositions included the phrase ‘to prevent the spread of COVID-19’ in order to ensure that the statements were read instrumentally (i.e. as specific pandemic-related measures) and not categorically.
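The counterbalancing constraints just described can be sketched as follows. This is an illustrative reconstruction rather than the experiment’s actual (JavaScript) implementation: the function name, the placeholder content labels, and the assumption that the 12 contents split evenly into 6 specific and 6 non-specific actions are ours.

```python
import random


def assign_critical_trials(specific, nonspecific, rng):
    """Return (content, direction) pairs for one participant.

    Each content is used exactly once; within each specificity class,
    half of the contents go to G>O and half to O>G.
    Assumes (hypothetically) equally sized, even-length content pools.
    """
    if len(specific) != len(nonspecific) or len(specific) % 2:
        raise ValueError("pools must have the same, even size")
    trials = []
    for pool in (specific, nonspecific):
        shuffled = rng.sample(pool, len(pool))  # random order, no repeats
        half = len(shuffled) // 2
        trials += [(content, "G>O") for content in shuffled[:half]]
        trials += [(content, "O>G") for content in shuffled[half:]]
    rng.shuffle(trials)  # randomize presentation order
    return trials


# Placeholder labels; the real contents are those listed in Table 3.1.
specific = [f"specific_{i}" for i in range(6)]
nonspecific = [f"nonspecific_{i}" for i in range(6)]
trials = assign_critical_trials(specific, nonspecific, random.Random(0))
```

Under these assumptions each participant receives 12 critical trials, 6 per Directionality, with 3 specific and 3 non-specific contents in each direction.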

Table 3.1 Possible contents in critical items

To ensure that the participants were sensitive to classical inference patterns, we additionally included control trials. Controls featured either valid entailments such as (9a) (upward inferences) or invalid entailments such as (9b) (downward inferences). Participants were presented with six controls of each type, randomly selected from a list of 24 statements. Control performance served as our exclusion criterion.

  1. (9)
    1. a.

      Cases are on the rise in every European country. Therefore, cases are on the rise in France.

    2. b.

      Cases are on the rise in France. Therefore, cases are on the rise in every European country.
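The exclusion criterion based on these controls can be sketched in a few lines. The function name and the data layout are hypothetical, and we assume a strict reading of ‘more than 2/3’, on which a participant needs at least 5 correct answers out of 6 for each control type.

```python
from fractions import Fraction

THRESHOLD = Fraction(2, 3)  # accuracy must strictly exceed this (assumed reading)


def keep_participant(correct_by_type, total_by_type):
    """Return True if accuracy exceeds 2/3 on every control type."""
    return all(
        Fraction(correct_by_type[kind], total_by_type[kind]) > THRESHOLD
        for kind in total_by_type
    )


totals = {"valid": 6, "invalid": 6}
kept = keep_participant({"valid": 5, "invalid": 6}, totals)      # 5/6 and 6/6
excluded = keep_participant({"valid": 4, "invalid": 6}, totals)  # 4/6 is not above 2/3
```

Exact fractions are used so that the boundary case of 4 out of 6 is not affected by floating-point rounding.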

2.3 Control Truth-Value Judgment Task

Responses in an inferential task like the one presented here might be influenced by participants’ first-order opinions about the truth of the premises and/or conclusions. For example, participants might be more likely to accept an argument whose premise they believe to be true than one with a false premise, regardless of the structure of the argument. This may be further boosted by the specific nature of the content used in our experiment, which concerns a topic of public debate, namely, the COVID-19 pandemic and the measures undertaken against it. By contrast, if the contents of our stimuli were purely abstract actions (e.g., ‘being kind’ or ‘arriving late’), participants might find it easier to abstract away from their first-order judgments about the premises and/or conclusion and assess the relevant inferences on their own merits.

In order to evaluate whether the perceived truth of the premises had an impact on the inference patterns, we included a control truth-value judgment task (TVJT) at the end of the experiment. Participants were told they were going to see a series of statements and they had to decide whether the statements were true, false or whether they could not tell (i.e., ‘I am not sure’ option). The statements that participants were presented with were the premises of the twelve critical arguments that they had seen during the inferential task. Participants’ responses during this TVJT would then be used to estimate the influence of perceived truth-value on inference acceptability.

2.4 Procedure

Participants were directed to a web-based Inferential Judgment experiment, implemented using JavaScript. Participants were told they were going to see statements about the pandemic, and they had to decide whether these statements made sense or not. Control and critical items were then presented in a random order. In the second part of the experiment, participants were introduced to the Control Truth-Value Judgment task. They were told they were going to see similar statements to the ones they had seen before, but now they had to decide whether the statements were true, false or whether they could not tell (i.e., ‘I am not sure’ option).

All materials, including instructions and both critical and control items, can be found in the following OSF repository.Footnote 11

3 Results

Figure 3.2 shows the mean proportion of ‘makes sense’ responses for critical trials as a function of the Directionality condition. Figure 3.3 breaks this down by the participants’ truth value judgments during the second part of the experiment (TVJT), that is, as a function of whether participants perceived the premises to be true, false or whether they weren’t sure.

Fig. 3.2 Proportion of ‘makes sense’ responses in the Inferential Task as a function of the Directionality condition. Error bars represent standard error on by-participant means; dots represent individual participant means

Fig. 3.3 Proportion of ‘makes sense’ responses in the inferential task as a function of the directionality condition and the truth value judgment assigned during the TVJT. Error bars represent standard error on by-participant means; dots represent individual participant means
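For readers unfamiliar with the error bars used in these figures, the following minimal sketch shows one common way of computing a standard error on by-participant means for a single condition. It is not taken from the analysis scripts in the OSF repository; the function name and input format are assumptions.

```python
from math import sqrt
from statistics import mean, stdev


def by_participant_summary(responses):
    """Summarize binary responses for one condition.

    responses: iterable of (participant_id, response) with response in {0, 1}.
    Returns (mean of participant means, standard error of those means).
    """
    per_participant = {}
    for pid, response in responses:
        per_participant.setdefault(pid, []).append(response)
    means = [mean(values) for values in per_participant.values()]
    standard_error = stdev(means) / sqrt(len(means))  # sample SD / sqrt(n)
    return mean(means), standard_error


# Toy data: three hypothetical participants, two trials each.
toy = [("p1", 1), ("p1", 1), ("p2", 1), ("p2", 0), ("p3", 0), ("p3", 0)]
grand_mean, se = by_participant_summary(toy)
```

Averaging within participants first gives each participant equal weight, regardless of how many trials they contributed to the condition.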

We ran a mixed-effects logistic regression model to evaluate the effect of Directionality (G > O; O > G), of Truth Value Judgment (true, false, neither) and of the Directionality:Truth Value interaction on participants’ responses. Directionality and Truth Value Judgments were sum-coded. Responses were coded as a binary variable (1 if ‘makes sense’; 0 otherwise). We included by-Participant random intercepts and by-Directionality slopes. P-values were obtained from asymptotic Wald tests, and the standard alpha level of 0.05 was used to determine significance. All analyses were carried out in R (R Core Team, 2019), using the lme4 package (Bates et al., 2014). Data and scripts for the analyses can be found in the OSF repository (see Footnote 11).
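One way to write out the model just described is given below. The symbols are ours, and we assume that the random-effects structure amounts to a random intercept and a random Directionality slope for each participant j; the exact specification is the one in the analysis scripts.

```latex
% y_{ij}: response of participant j on trial i (1 = 'makes sense')
% D_{ij}: sum-coded Directionality (+1 / -1)
% T_{1,ij}, T_{2,ij}: two sum-coded contrasts for the three-level Truth Value Judgment
\begin{align*}
\operatorname{logit} \Pr(y_{ij} = 1) ={}& \beta_0 + \beta_D D_{ij}
  + \sum_{k=1}^{2} \beta_{T_k} T_{k,ij}
  + \sum_{k=1}^{2} \beta_{DT_k} D_{ij} T_{k,ij} \\
& + u_{0j} + u_{1j} D_{ij},
  \qquad (u_{0j}, u_{1j}) \sim \mathcal{N}(\mathbf{0}, \Sigma)
\end{align*}
```

Because the predictors are sum-coded, each fixed-effect coefficient expresses a deviation from the grand mean on the log-odds scale rather than from a reference level.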

The results of the model are reported in Table 3.2. The intercept value shows the grand mean of ‘makes sense’ responses across participants and conditions (i.e., for both Directionalities, and for all three Truth Values). The fact that this intercept is positive and significant reveals that the log-odds of ‘makes sense’ responses is significantly higher than what one would expect by chance. In other words, participants tend to accept the critical arguments, that is, arguments involving ‘good’ and ‘ought’ such as (7a, b) above, regardless of their Directionality or Truth Value Judgment.
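The step from a positive intercept to above-chance acceptance relies on the inverse logit transformation. Writing β₀ for the intercept and p₀ for the corresponding probability (symbols ours):

```latex
\[
p_0 = \frac{1}{1 + e^{-\beta_0}}, \qquad \beta_0 > 0 \iff p_0 > \tfrac{1}{2}
\]
```

A log-odds of zero corresponds to a probability of exactly one half, which is the chance level for a binary ‘makes sense’ response.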

Table 3.2 Model output

The Directionality value shows the difference between responses for O > G arguments and the grand mean. The effect of Directionality is negative and significant, revealing that the proportion of ‘makes sense’ responses was significantly lower in the O > G directionality than in the G > O one. The model also reveals a significant effect of Truth Value Judgments, suggesting that the truth value that participants assign to the premise of an argument has an impact on whether they accept the argument.

Finally, our model also assesses the interaction between Directionality and Truth Value Judgment (i.e., whether participants’ sensitivity to Directionality depends on the truth value assigned to their premises), but this was only marginally significant.

A second post-hoc analysis was then run to determine whether the effect of Directionality interacted with participants’ certainty with respect to the truth of the premises; that is, with the difference between assigning a proper truth value (‘true’ or ‘false’ judgments) and being uncertain (‘I am not sure’ judgments). This second model specifically tested an interaction between Directionality and Certainty in participants’ inferences. In this case, the results reveal a significant Directionality:Certainty interaction (p < 0.01), indicating that the effect of Directionality is stronger for those arguments about whose premises the participants were uncertain.
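The recoding behind the Certainty factor, together with a purely descriptive breakdown of acceptance rates, can be sketched as follows. This does not reproduce the second model itself; the label strings, function name and toy data are hypothetical.

```python
from collections import defaultdict

# Collapse the three TVJT answers into two Certainty levels.
CERTAINTY = {"true": "certain", "false": "certain", "not_sure": "uncertain"}


def acceptance_by_certainty(rows):
    """Compute acceptance rates per (certainty, direction) cell.

    rows: iterable of (direction, tvjt_answer, accepted), accepted in {0, 1}.
    """
    counts = defaultdict(lambda: [0, 0])  # cell -> [accepted, total]
    for direction, answer, accepted in rows:
        cell = counts[(CERTAINTY[answer], direction)]
        cell[0] += accepted
        cell[1] += 1
    return {key: acc / total for key, (acc, total) in counts.items()}


# Toy data, invented for illustration only.
toy_rows = [
    ("G>O", "not_sure", 1), ("G>O", "not_sure", 1),
    ("O>G", "not_sure", 1), ("O>G", "not_sure", 0),
    ("G>O", "true", 1), ("O>G", "false", 0),
]
rates = acceptance_by_certainty(toy_rows)
```

Comparing the G > O and O > G rates within each Certainty level gives a descriptive counterpart to the Directionality:Certainty interaction tested in the model.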

4 Discussion

Our results show that, overall, participants were sensitive to the G > O/O > G asymmetry in the following direction: arguments showcasing the G > O direction of inference were accepted to a higher degree than arguments exemplifying the O > G pattern. However, these results must be taken with a grain of salt. First, it is not the case that participants rejected the O > G pattern of inference. The results are above 50% acceptance in both directions. This means that, overall, inferences in both directions were accepted by most participants. Secondly, the main effect of Directionality was largely driven by arguments about whose premises the participants were unsure. As shown in Fig. 3.3, the starkest contrast in the acceptability of G > O/O > G inferences is found for arguments whose premises, when presented in the control TVJT, received the answer ‘I am not sure’.Footnote 12 By contrast, for items whose premises were classified as true, participants generally accepted both directions of inference (in both cases, results are at ceiling), while for items whose premises were classified as false, acceptance in both directions was around 50%.

This might suggest, we think, that participants have their inferential capacities overridden, or at least biased, by their first-order judgment about the premise (recall that the control TVJT asked participants about the truth of the premise—not the conclusion—of the items they had previously seen). In other words, if someone thinks that washing one’s hands is good to prevent COVID-19, they’re likely to accept inferences of the form: ‘Washing your hands is good to prevent the spread of COVID-19. Therefore, you ought to wash your hands to prevent the spread of COVID-19’. And similarly, if someone agrees that one ought to wash one’s hands to prevent the spread of COVID-19, then they are likely to assent to an inference of the form: ‘You ought to wash your hands to prevent the spread of COVID-19. Therefore, washing your hands is good to prevent the spread of COVID-19’.

On the other hand, for premises about whose truth participants were unsure, participants were significantly more sensitive to the difference between the G > O and O > G directions. It is tempting to think that, in these cases, participants paid less attention to the content of the premises/conclusion than to the form of the argument. In those cases, participants agreed more often with the G > O than with the O > G direction.

We conclude that, while both directions were found to be acceptable, our study provides some evidence that the inference from ‘good’ to ‘ought’ is stronger than the inference from ‘ought’ to ‘good’. At a very minimum, we take these results to speak against hypothesis H3, namely, the hypothesis that ‘φ-ing is good’ and ‘you ought to φ’ are logically independent. The fact that participants accepted both directions of entailment suggests that such schemas are not independent.

That leaves us with H1 and H2. H1 is the hypothesis that ‘φ-ing is good’ and ‘you ought to φ’ are logically equivalent, and thus interchangeable; H2, that ‘φ-ing is good’ and ‘you ought to φ’ are logically asymmetric, that is, either ‘φ-ing is good’ entails ‘you ought to φ’ but not the other way around, or ‘you ought to φ’ entails ‘φ-ing is good’ but not the other way around. Our results, however, are in tension with both hypotheses.

On the one hand, the fact that there exists a significant asymmetry in the degree to which participants accept G > O and O > G inferences suggests that ‘φ-ing is good’ and ‘you ought to φ’ are not logically equivalent. If they were, we would not expect to see a main effect of Directionality. This speaks against H1.

On the other hand, while our results reveal an asymmetry between G > O and O > G inferences, they are not compatible with the view that O > G arguments are invalid. Both G > O and O > G inferences were found to be overall acceptable (significantly above 50%). This contrasts with our control results, where participants generally rejected invalid entailments (e.g., ‘Cases are on the rise in France. Therefore, cases are on the rise in every European country’). This indicates that the participants do not interpret O > G inferences in the same way in which they interpret an invalid entailment. Rather, our results seem to suggest that G > O and O > G trigger different types of inferences.

Thus, we think that there is room for more nuanced versions of H2, as well as other potential explanations. In the remainder of the chapter, we shall discuss two hypotheses that, we think, can account for our data. The two hypotheses that we explore are mutually compatible. What is more, they do not exclude that there could be other explanations as well, some of which may appeal to contents that are pragmatically implicated, rather than semantically entailed. For reasons of space, we shall set such pragmatic explanations aside.

4.1 A Difference in the Context-Sensitivity of ‘Good’ and ‘Ought’

In the introduction, we discussed the possibility that ‘good’ and ‘ought’ are context-sensitive expressions. There, we insisted on the similarities between their context-sensitivity: both expressions admit of similar “flavors” of evaluation, and are circumstance-dependent. However, it could be that some of these parameters are determined differently for ‘good’ and ‘ought’. In particular, ‘good’ and ‘ought’ could be circumstance-sensitive in different ways. Suppose that ‘ought’ denotes what is beneficial in the current circumstances, while ‘good’ denotes a stable, long-lasting benefit. This implies that what is good in general is something that ought to be the case in the current circumstances, but not necessarily the other way round: what ought to be the case in current circumstances need not be good in general. Recall the two basic directions of inference:

  1. (7)
    1. a.

      You ought to wear a mask in public to prevent the spread of COVID-19. Therefore, wearing a mask in public is good to prevent the spread of COVID-19.

    2. b.

      Wearing a mask in public is good to prevent the spread of COVID-19. Therefore, you ought to wear a mask in public to prevent the spread of COVID-19.

Intuitively, when one thinks of (7a), it is possible to accept the premise because we are in very special circumstances: one ought to wear a mask because, let us suppose, we are in a particularly acute phase of the pandemic, in the midst of a global health crisis. But one may reject the conclusion because, other things being equal, being ‘good’ seems to require a more general assessment; that is, something is good if it is beneficial in the long run, or in general circumstances.Footnote 13 The fact that wearing a mask is beneficial in the actual, special circumstances does not guarantee that it is beneficial in general, or that it has benefits beyond what is required by the current circumstances. By contrast, if something is good, then it has benefits that apply to circumstances in general, and thus (most likely) apply to the current circumstances as well.

The simplest way to capture this circumstance-sensitivity would be to say the following:

‘φ-ing is good’ is true iff φ-ing is beneficial in general

‘One ought to φ’ is true iff φ-ing is beneficial in the current circumstances

Clearly, the inference from something being beneficial in general to something being beneficial in the actual circumstances is stronger than the inference from something being beneficial in the actual circumstances to it being beneficial in general.
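The asymmetry between the two truth conditions can be spelled out in a minimal sketch. The notation is introduced purely for illustration and is not part of the proposal as stated: we assume a set $C$ of circumstances, a designated current circumstance $c^{*} \in C$, and a predicate $B(\varphi, c)$ read as ‘φ-ing is beneficial in $c$’. We also idealize ‘in general’ as universal quantification over $C$, which is stronger than a generic reading that tolerates exceptions.

```latex
\begin{align*}
\text{`$\varphi$-ing is good' is true} &\iff \forall c \in C:\ B(\varphi, c) \\
\text{`One ought to $\varphi$' is true at } c^{*} &\iff B(\varphi, c^{*}) \\[4pt]
\forall c \in C:\ B(\varphi, c) &\models B(\varphi, c^{*}) && \text{(since } c^{*} \in C\text{)} \\
B(\varphi, c^{*}) &\not\models \forall c \in C:\ B(\varphi, c) && \text{(unless } C = \{c^{*}\}\text{)}
\end{align*}
```

On this idealized universal reading, the G > O direction comes out as a strict entailment. If ‘in general’ is instead read generically, so that exceptions are allowed, that direction is only defeasible, though it plausibly remains stronger than its converse.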

Note however, that even though this is a way of modelling an asymmetry between ‘good’ and ‘ought’, it does not yet account for our results. Recall that we observed that both directions, G > O and O > G, are largely acceptable. Is this hypothesis compatible with the overall acceptability of these inferences? We think that it is. But showing this requires saying something about inferences involving generics.

If, as suggested by this hypothesis, the contrast between ‘good’ and ‘ought’ reduces to the contrast between going from a generally-quantified statement to a particular instance and vice versa, our hypothesis would be bolstered if such inference patterns showed a similar asymmetry. Consider the following inferences:

  1. (10)
    1. a.

      Dogs have 4 legs. Therefore, this dog has 4 legs.

    2. b.

      This dog has 4 legs. Therefore, dogs have 4 legs.

Obviously, (10a) is a sensible inference, whereas the converse inference in (10b) is much less so. However, there is a sense in which an inference like (10b) might be taken to be acceptable. After all, we learn about general properties of things via observation of particular instances. Depending on the property or feature involved, the observation of a single instance might be sufficient to make a generalization.

This contrast opens up the possibility that the inference from a generalization to a particular instance is almost, though not quite, as strong as an entailment (not quite, because generalizations admit of exceptions), whereas in the other direction it becomes a weaker form of inductive inference.Footnote 14

We think that this might account for the observed asymmetry between O > G and G > O inferences, repeated here:

  1. (7)
    1. a.

      You ought to wear a mask in public to prevent the spread of COVID-19. Therefore, wearing a mask in public is good to prevent the spread of COVID-19.

    2. b.

      Wearing a mask in public is good to prevent the spread of COVID-19. Therefore, you ought to wear a mask in public to prevent the spread of COVID-19.

In (7b), according to our hypothesis, we would have a quasi-entailment from a generalization (wearing a mask is beneficial in general to prevent the spread) to a particular instance (wearing a mask is beneficial in these circumstances to prevent the spread). By contrast, in (7a) we would find the opposite, an inference from a particular instance (beneficial in these circumstances) to a generalization (beneficial in general).

Pursuing this hypothesis further would require testing the inferential patterns involving ‘good’ and ‘ought’ against the relevant generic > particular and particular > generic control cases. For the moment, we must leave this possibility for future work.Footnote 15

4.2 The Prescriptive Character of Deontic ‘Ought’

A different hypothesis to account for the observed asymmetry is the idea that evaluatives and deontics differ in their illocutionary force: while evaluatives are declarative statements, deontics are a kind of prescriptive statement. Prescriptive statements are those that issue commands, directions or recommendations, among other things. This includes, most notably, imperatives and hortatives:

  1. (11)
    1. a.

      Wear a mask in public! (imperative)

    2. b.

      Let’s wash our hands afterwards. (hortative)

Contrary to declaratives, prescriptive statements cannot be embedded in certain environments, such as the antecedent of a conditional (* marks ungrammaticality):

  1. (12)
    1. a.

      * If wear a mask in public! then, p

    2. b.

      * If let’s wash our hands afterwards, then, p.


However, they are perfectly fine in the consequent of a conditional:

  1. (13)
    1. a.

      If p, then wear a mask in public!

    2. b.

      If p, then let’s wash our hands afterwards.

Relatedly, prescriptive statements appear preferably as the conclusion, rather than as premises, of arguments.Footnote 16 Consider the contrast between these two arguments (# indicates that the construction is infelicitous):

  1. (14)
    1. a.

      # Wear a mask in public! Therefore, p.

    2. b.

      # Let’s wash our hands afterwards. Therefore, p.

It sounds odd to follow up a command or recommendation with a declarative consequence. The most natural direction is the opposite, that is, to conclude an argument with a prescription (see Lewiński, 2021 for a recent study on the consequences of practical arguments):

  1. (15)
    1. a.

      p. Therefore, wear a mask in public!

    2. b.

      p. Therefore, let’s wash our hands afterwards.

Deontic modals, and in particular, ‘ought’, are subject to similar—though not identical—distributional constraints. First, ‘ought’ is typically not found in the antecedent of a conditional (even if it is not outright ungrammatical in that position)Footnote 17:

  1. (16)
    1. a.

      If one ought to wear a mask in public, then p.

    2. b.

      If we ought to wash our hands afterwards, then p.

By contrast, it is perfectly common as a consequent:

  1. (17)
    1. a.

      If p, then one ought to wear a mask in public.

    2. b.

      If p, then we ought to wash our hands afterwards.

Secondly, ‘ought’-statements are most commonly found as conclusions rather than as premises of arguments. The inference patterns in (18), which we have marked with ‘??’, sound significantly less natural than those in (19).

  1. (18)
    1. a.

      ?? You ought to wear a mask in public. Therefore, p.

    2. b.

      ?? We ought to wash our hands afterwards. Therefore, p.

  2. (19)
    1. a.

      p. Therefore, you ought to wear a mask in public.

    2. b.

      p. Therefore, we ought to wash our hands afterwards.

When we do find ‘ought’ as a premise, the conclusion is most naturally interpreted as an inference to the best explanation. For example, take p in (18a) to be ‘Masks must be efficient in filtering out potential contaminants’. We then likely understand (18a) as conveying that the efficiency of masks in filtering the contaminants explains why the prescription of wearing masks in public has been issued in the first place.

We think that these observations go some way towards explaining the observed asymmetry between O > G and G > O inferences. If ‘ought’ appears most naturally as a conclusion, rather than as a premise, then this might be driving some participants’ rejection of O > G inferences.

Nevertheless, this hypothesis stands at odds with the overall acceptability of both patterns. To account for that, we would still need to assume that, at some general level of description, deontics and evaluatives make similar argumentative moves, thereby allowing speakers to make inferences in both directions. However, if we combine the hypothesis with the observation that deontics and evaluatives have roughly the same “flavors”, are context-sensitive in similar ways, and both belong in a broader class of normative expressions, then we can have our cake and eat it, too. That is, we predict overall acceptance of both inference patterns, but together with the observation that ‘ought’ occurs more naturally as a conclusion than as a premise, we also predict that G > O inferences will be seen as more acceptable than O > G inferences.

Finally, this hypothesis is, to some extent, compatible with the idea explored in Sect. 4.1 that ‘good’ and ‘ought’ are circumstance-sensitive in different ways. Similarly to deontics, imperatives may be seen as circumstance-specific in that they issue commands and requests to be performed “here and now”, so to speak. If ‘good’, by contrast, denotes things to be done not only here and now but in general, then it is easy to see how one can go from ‘good’ to an imperative, but not the other way around.

5 Conclusion and Prospects for Future Research

Argumentation in public policy abounds with normative expressions. When a new policy, or a change in the existing ones, is proposed, what we typically see are arguments whose conclusions take the form of a deontic statement, or a prescription, namely, what one ought to do. In this chapter, we have explored arguments with normative force with a special focus on the interaction between two kinds of normative expressions: evaluatives versus deontics, that is, what is good versus what one ought to do. While philosophers in metaethics have been interested in the relationship between the evaluative and the deontic realm, and relatedly, between values, on the one hand, and norms and obligations, on the other, this interest has not reached deep into the philosophy of argumentation. In deontic logic, there have been explorations of the relationship between ‘ought-φ’ and ‘φ-ing is the best option’, yet hardly anything has been said on how ‘ought-φ’ relates to ‘φ-ing is good’. Here, we have undertaken first steps in trying to understand this relationship.

Our research brings together theoretical and empirical investigation. The experimental study that we have presented aimed at understanding how people move between what is good and what ought to be the case, when it comes to argumentation. The study has shown, in a nutshell, that both directions are deemed to be acceptable. Nevertheless, it has also revealed an interesting asymmetry: arguing from what is good to do to what one ought to do is deemed more acceptable than the other way round. Interestingly, this asymmetry shows up most clearly in those cases in which participants have no personal opinion regarding the truth of the premises or conclusions.

We have presented two sets of considerations that may be able to explain the observed results. The first relies on the idea that ‘good’ and ‘ought’ are sensitive to context in slightly different ways. The latter typically makes reference to the current circumstances, whereas the former purports to apply more generally. This explains why the inference from ‘good’ to ‘ought’ is overall more robust, while the converse, although less robust, is still deemed acceptable. The second relies on the idea that evaluative and deontic statements differ in their illocutionary force: the former are assertions, the latter prescriptions. Since prescriptions are more likely to occur as conclusions, rather than premises, of an argument, we are again able to explain the observed asymmetry. The two sets of observations are mutually compatible, and do not preclude that there may be other elements relevant to understanding the relationship between the two types of propositions and statements. In particular, there may well be some pragmatic effects that have gone unnoticed. To pursue all these explanations further, we would want to compare the inferences involving ‘good’ and ‘ought’ with other types of inferences, and to do so both on a theoretical and an empirical level. We hope to undertake this task in the future.