Choiceworthiness
A finite set \(A = \{a_{1}, \ldots , a_{|A|}\}\) of “feasible acts” presents itself. There is a finite set of “possible states” \(S = \{s_{1}, \ldots , s_{|S|}\}\) to which I assign positive probability.Footnote 2 I assign utilities to performing each act in each state, as represented by the utility function u(A, s), where the value assigned to each act is not necessarily independent of the alternatives in A.Footnote 3 I also find myself in an overall finite epistemic position e, specifying the probabilities I assign to all relevant claims.Footnote 4 Let us call \(\pi = \langle A, u, e \rangle \) my “choice problem”.
Definition 2.1
A choice problem \(\pi \) is a triple \(\langle A, u, e \rangle \) of (i) a set of acts A, (ii) a state-contingent utility function u over A, and (iii) an epistemic position e.
We will say that my utility function specifies the “objective choiceworthiness” (or simply “choiceworthiness”) of each act, conditional on each state. That is, given s, the choiceworthiness of \(a_{i}\) is \(u(A, s)_{i}\)—the ith element of the |A|-vector u(A, s). From the probabilities I assign to the states in S, therefore, I also assign probabilities to potential values of the objective choiceworthiness of each act.
Definition 2.2
A (finite) choiceworthiness distribution is a (finite) probability distribution over choiceworthiness values for some (finite) set of acts.
Let \({\mathbb {D}}^{n}\) denote the set of all finite probability distributions in \({\mathbb {R}}^{n}\), and let some \(d(\pi ) \in {\mathbb {D}}^{|A|}\) represent the choiceworthiness distribution entailed by \(\pi \).
Many other finite probability distributions over \({\mathbb {R}}^{|A|}\) might do just as well as the chosen d at representing my finite choiceworthiness distribution. Exactly which others depends on how much structure is contained in our understanding of “utility”. If we understand utility to be a merely “ordinal” quantity, for instance, then any transformation of d that is monotonic in choiceworthiness (and constant in probability) represents the same choiceworthiness distribution. We are here assuming nothing about utility except that it at least partially orders act-state pairs from a given {feasible set \(\times \) possible set}, and that \({\mathbb {R}}\) is “rich enough” to capture any potential difference between the choiceworthiness values of particular act-state pairs—that choiceworthiness cannot be lexicographic, for instance. As discussed at the end of Sect. 2.5, these assumptions about choiceworthiness, here made implicitly, will follow from similar assumptions made explicitly about subjective choiceworthiness.
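To make the step from \(\pi \) to \(d(\pi )\) concrete, the following is a minimal sketch rather than part of the formal framework. It assumes that the state probabilities and the vectors u(A, s) are already given as real numbers; the function name, the state labels, and all numerical values are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

def choiceworthiness_distribution(state_probs, utilities):
    """Return d(pi) as a dict mapping choiceworthiness vectors to probabilities.

    state_probs: dict state -> positive probability (assumed to sum to 1)
    utilities:   dict state -> tuple of |A| real numbers, i.e. u(A, s)
    """
    dist = defaultdict(float)
    for state, p in state_probs.items():
        # States with the same utility vector collapse onto one point of R^|A|.
        dist[tuple(utilities[state])] += p
    return dict(dist)

# Hypothetical problem with two acts and three states.
state_probs = {"s1": 0.5, "s2": 0.25, "s3": 0.25}
utilities = {"s1": (1.0, 0.0), "s2": (0.0, 2.0), "s3": (1.0, 0.0)}
d = choiceworthiness_distribution(state_probs, utilities)
```

Here s1 and s3 assign the same choiceworthiness to both acts, so the resulting distribution has two support points rather than three. The distribution records only choiceworthiness values and their probabilities, not the states that gave rise to them.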
Subjective choiceworthiness
I am uncertain about acts’ choiceworthinesses. Even so, I may know that one act is the most appropriate for me to choose, given my epistemic position. As I write this, for instance, I assign high probability to the event that, if I go to the doctor, I will swiftly be cured of my back injury (an outcome I would prefer immensely to the status quo), and low probability to the roughly complementary event that, if I go to the doctor, I will waste some time and remain injured (an outcome to which I would slightly prefer the status quo). Despite this uncertainty, and all my other uncertainty, I am in fact certain that going to the doctor is the “better choice” for me right now (by far!). There is thus some scale on which the act of going to the doctor scores higher for me than the act of not going, and would score higher for anyone with the same utility function, in the same overall epistemic position, facing the same set of feasible acts. Let us call this scale “subjective choiceworthiness”.
It is my intuition that subjective choiceworthiness c, when well-defined, is fundamentally a cardinal scale. That is, I would maintain that a representation of acts’ subjective choiceworthinesses (for an agent in a given situation) in \({\mathbb {R}}^{|A|}\) would be unique at least up to affine transformation. If my feasible act set \(A = \{a_{1}, a_{2}, a_{3}\}\) consists of going to the doctor (\(a_{1}\)), going to a very slightly less competent doctor (\(a_{2}\)), or not going at all (\(a_{3}\)), then there is some important and foundational sense in which, given my epistemic position and my preferences, the distance between \(c(\pi )_{1}\) and \(c(\pi )_{2}\) is less than the distance between \(c(\pi )_{2}\) and \(c(\pi )_{3}\). It might be objected that I will always do whatever winds up being most subjectively choiceworthy; that therefore, in the absence of a specified theory of decision-making under uncertainty, no information is conveyed by postulated differences between the acts not chosen; and that c is therefore better understood as merely an ordering, or perhaps even as a choice relation. To this it might be replied that, under certain circumstances, differences in subjective choiceworthiness could bear some relationship to the subjective probability with which a subjectively sub-optimal act would become optimal upon further reflection. Or that cardinal subjective choiceworthiness takes on a clearer meaning in other situations of normative uncertainty (i.e. one might not choose the most subjectively morally choiceworthy act, and might in some sense be more blameworthy the less subjectively morally choiceworthy one’s act was)—and that it would be strange for subjective choiceworthiness to be fundamentally cardinal in one of these situations but not the other. 
Or that our models of the world are generally simpler when we extend our intuitions regarding quantities’ cardinality beyond the domains in which they happen to be testable—such as our intuition that temperature is generally cardinal, even on some cold, distant star that we will only discover if its temperature rises above some threshold.
Furthermore, cardinal subjective choiceworthiness allows for the convergence results described below, and less structured interpretations of subjective choiceworthiness would not. If we are otherwise persuaded that the regress problem must have some solution or other, it is not circular to allow this observation itself to lend credibility to the concept of cardinal subjective choiceworthiness.
In any event, for the purposes of this analysis, we will understand subjective choiceworthiness (again, when well-defined) to be cardinal. We will represent it by a “subjective choiceworthiness function” \(c(\pi )\), where c assigns a real number to the subjective choiceworthiness of each of the acts in a feasible set A, for an agent with a utility function u, in epistemic position e.
Note that by having c map into the real numbers, we are assuming that all information about differences in subjective choiceworthiness (and therefore utility) can be captured by ratios of differences in real numbers. We are here explicitly assuming for subjective choiceworthiness what we provisionally assumed above for objective choiceworthiness—that, for instance, it cannot be lexicographic. Like probability theories that let us condition on probability 0 events, utility theories that let us distinguish between acts that differ infinitesimally in choiceworthiness may also be interesting to consider in light of the regress problem. However, we will not touch them here.
Metachoiceworthiness
In general, if I am to translate a choiceworthiness distribution d into a determination of how to act, I must invoke a “decision theory”: a collection of claims concerning how to evaluate acts in light of one’s choiceworthiness distribution. For example, using this terminology, one decision theory is “Expected Choiceworthiness Theory” (EC). EC is characterized by the fact that, if I am certain that it is the correct decision theory, then each act’s subjective choiceworthiness for me is its expected choiceworthiness under d.Footnote 5 Another decision theory would be “minimum choiceworthiness”—a theory characterized by the fact that, if I am certain that it is the correct decision theory, then each act’s subjective choiceworthiness for me is its minimum possible choiceworthiness under d.
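The two decision theories just described can be sketched as functions from a choiceworthiness distribution to a vector of real numbers, one per act. This is an illustrative sketch only: it assumes that d is stored as a mapping from choiceworthiness vectors to positive probabilities, and the function names and numbers are hypothetical.

```python
def expected_choiceworthiness(dist):
    """EC: each act's probability-weighted mean choiceworthiness under d."""
    n = len(next(iter(dist)))
    return tuple(sum(p * v[i] for v, p in dist.items()) for i in range(n))

def minimum_choiceworthiness(dist):
    """Each act's lowest choiceworthiness among the support points of d."""
    n = len(next(iter(dist)))
    return tuple(min(v[i] for v in dist) for i in range(n))

# Hypothetical distribution over two acts.
d = {(10.0, 0.0): 0.75, (-2.0, 0.0): 0.25}
ec = expected_choiceworthiness(d)  # (7.0, 0.0)
mc = minimum_choiceworthiness(d)   # (-2.0, 0.0)
```

On these numbers the two theories disagree about the ranking: the first act scores higher under expected choiceworthiness, the second under minimum choiceworthiness.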
Just as I am uncertain about the true state of the world, I may also be uncertain about the correct decision theory. To come to a determination of how to act, therefore, I may have to invoke a sort of “meta decision theory” (or, “2-metatheory”): a collection of claims concerning how to respond to one’s uncertainty over decision theories.
Note that, since this is so, the decision theories (we will awkwardly call these “1-metatheories”, for ease of indexing) cannot themselves be claims about subjective choiceworthiness. This is perhaps a surprising claim, so it bears repeating: expected utility theory (for example) is not, in this language, a theory about what subjective choiceworthiness is, or even about what it ought to be “all things considered”. It is, rather, a theory about what subjective choiceworthiness “1-ought” to be, for someone with a given objective choiceworthiness distribution over his feasible set—or, a theory about what subjective choiceworthiness is for someone with a given objective choiceworthiness distribution over his feasible set, if he knows the true 1-metatheory.
Suppose, for instance, that I am faced with three feasible acts, that I assign probability \(\frac{1}{2}\) to each of two 1-metatheories, \(t_{1}\) and \(t_{2}\), and that I am certain of “2-metatheory” m. The theories are such that if I were certain of \(t_{1}\), the subjective choiceworthinesses of the acts would be ordered \(a_{1} \succ a_{2} \succ a_{3}\); if I were certain of \(t_{2}\), the subjective choiceworthinesses of the acts would be ordered \(a_{3} \succ a_{2} \succ a_{1}\); and, given the probabilities I assign to \(t_{1}\) and \(t_{2}\), and my certainty about m, the subjective choiceworthinesses of the acts are in fact ordered \(a_{2} \succ a_{3} \succ a_{1}\). Although I assign probability \(\frac{1}{2}\) to \(t_{1}\), I assign no positive probability to the event that \(a_{1}\) is more subjectively choiceworthy than \(a_{2}\) from my epistemic position. The 1-metatheories’ claims, therefore, are not claims about the acts’ subjective choiceworthinesses given my empirical uncertainty, but about how the acts score on an altogether different scale. Let us call this scale “metachoiceworthiness”, or “1-metachoiceworthiness”. Of course, metachoiceworthiness must be constructed such that, if I know that an act’s 1-metachoiceworthiness is x, then the act’s subjective choiceworthiness for me is also x. We might therefore informally think of 1-metachoiceworthiness as “whatever subjective choiceworthiness is, for someone who knows the correct 1-metatheory”. But since, again, decision theories are not actually claims about subjective choiceworthiness, let us begin by thinking about 1-metachoiceworthiness on its own terms, and only afterward consider its relationship to subjective choiceworthiness.
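One way to see that the three orderings in this example are jointly possible is to exhibit numbers that produce them. The sketch below is purely illustrative: the 1-metachoiceworthiness values are hypothetical, and the 2-metatheory m is assumed, for the sake of the example only, to take a probability-weighted average of the 1-metatheories’ claims. Nothing in the text commits m to that form.

```python
# Hypothetical 1-metachoiceworthiness claims for the acts (a1, a2, a3).
claims = {"t1": (3.0, 2.5, 0.0), "t2": (0.0, 2.5, 4.0)}
probs = {"t1": 0.5, "t2": 0.5}

def m(claims, probs):
    """Assumed 2-metatheory: probability-weighted average of the claims."""
    n = len(next(iter(claims.values())))
    return tuple(sum(probs[t] * claims[t][i] for t in claims) for i in range(n))

def ranking(values):
    """Act indices (1-based), from most to least choiceworthy."""
    return sorted(range(1, len(values) + 1), key=lambda i: -values[i - 1])

c = m(claims, probs)  # (1.5, 2.5, 2.0)
orderings = (ranking(claims["t1"]), ranking(claims["t2"]), ranking(c))
```

Under these assumptions the orderings come out as [1, 2, 3] for \(t_{1}\), [3, 2, 1] for \(t_{2}\), and [2, 3, 1] under m, matching the pattern described above.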
In any event, the elusiveness of subjective choiceworthiness is not restricted to “order 1”. Just as I may be uncertain as to the correct 1-metatheory, I may be uncertain as to the correct 2-metatheory; I may therefore have to appeal to a “3-metatheory”; and the 2-metatheories are therefore making claims not about acts’ subjective choiceworthiness given beliefs about their 1-metachoiceworthiness, but about acts’, say, “2-metachoiceworthiness” given beliefs about their 1-metachoiceworthiness. So our regress begins.
k-metachoiceworthiness
Let us call choiceworthiness “0-metachoiceworthiness”, choiceworthiness distributions “0-metachoiceworthiness distributions”, and decision theories “1-metatheories”. The concepts of k-metachoiceworthiness, k-metachoiceworthiness distributions, and k-metatheories can then together be defined recursively.
Definition 2.3
The k-metachoiceworthiness \(c_{k}\) of an act \(a_{i}\), for an agent facing finite choice problem \(\pi \), is \(a_{i}\)’s subjective choiceworthiness for an agent with the same (\(k - 1\))-metachoiceworthiness distribution as that entailed by \(\pi \), but who knows the correct k-metatheory.
Let us denote acts’ relative k-metachoiceworthiness by the two-place relation \(\succ _{\pi ,k}\).
Definition 2.4
A (finite) k-metachoiceworthiness distribution \(d_{k} \in {\mathbb {D}}^{|A|}\) is a probability distribution over k-metachoiceworthiness values for some (finite) set of acts A.
Definition 2.5
A k-metatheory, applied to a finite set of acts A, is a function \(t_{k} : {\mathbb {D}}^{|A|} \rightarrow {\mathbb {R}}^{|A|}\), representing claims about the k-metachoiceworthiness of the acts in A given (\(k-1\))-metachoiceworthiness distribution \(d_{k-1} \in {\mathbb {D}}^{|A|}\).
Note that, strictly speaking, if we want our k-metatheories to make k-metachoiceworthiness claims over finite act-sets of arbitrary size, we would have to say that a k-metatheory is a family of functions \(\{t_{k}^{n}\}\) from \({\mathbb {D}}^{n}\) to \({\mathbb {R}}^{n}\), with one for each \(n \in {\mathbb {N}}\). For simplicity, however, we will take \(n = |A|\) as given and interpret our project only as an attempt to find criteria under which the subjective choiceworthinesses of any n acts will be well-defined—with the understanding that identical reasoning would apply to any other n.
We can now define a few additional terms.
Definition 2.6
A k-metatheory distribution \(d_{t_{k}}\) is a probability distribution over k-metatheories.
Definition 2.7
A metatheoretic hierarchy (or simply “hierarchy”) T is a collection of k-metatheories \(t_{k}\) with one for each \(k > 0\).
Definition 2.8
A hierarchy distribution \(d_{T}\) is a probability distribution over hierarchies.
Let \(|d_{t_k}|\) and \(|d_T|\) denote the number of k-metatheories and hierarchies, respectively, to which I assign positive probability.
Let \(\vec {c_k} \in {\mathbb {R}}^{|d_{t_k}||A|}\) represent the claims made by my \(|d_{t_{k}}|\) k-metatheories about the k-metachoiceworthinesses of the |A| acts in A. Let \(\vec {p_{k}} \in \Delta ^{|d_{t_{k}}|-1}\) represent the probabilities I assign to these k-metatheories. We can now represent my k-metachoiceworthiness distribution by \(d_{k} = \langle \vec {c_{k}},\vec {p_{k}}\rangle \).Footnote 6
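The recursive structure of these definitions can be sketched as follows: each k-metatheory maps \(d_{k-1}\) to a vector of claims, and pairing those claims with the probabilities assigned to the k-metatheories yields \(d_{k}\). The sketch is an assumption-laden illustration, not a definitive implementation. Distributions are stored as mappings from value vectors to probabilities, the two example theories and all numbers are hypothetical, and reusing the same two functions at orders 1 and 2 is a simplification made only to keep the example short.

```python
def next_distribution(d_prev, theories):
    """Build d_k from d_{k-1}.

    d_prev:   dict mapping value vectors (tuples) to probabilities
    theories: list of (t_k, probability) pairs, where each t_k maps a
              distribution to a tuple of |A| real numbers
    """
    d_next = {}
    for theory, p in theories:
        claim = theory(d_prev)  # this theory's block of the claims vector c_k
        d_next[claim] = d_next.get(claim, 0.0) + p
    return d_next

def expectation(dist):
    n = len(next(iter(dist)))
    return tuple(sum(p * v[i] for v, p in dist.items()) for i in range(n))

def minimum(dist):
    n = len(next(iter(dist)))
    return tuple(min(v[i] for v in dist) for i in range(n))

# Hypothetical 0-metachoiceworthiness distribution over two acts.
d0 = {(10.0, 0.0): 0.75, (-2.0, 0.0): 0.25}
theories = [(expectation, 0.5), (minimum, 0.5)]
d1 = next_distribution(d0, theories)  # 1-metachoiceworthiness distribution
d2 = next_distribution(d1, theories)  # 2-metachoiceworthiness distribution
```

On these numbers the range of values claimed for the first act narrows from [-2, 10] at order 0 to [-2, 7] at order 1 and [-2, 2.5] at order 2, while the second act stays fixed at 0.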
The relationship of k-metachoiceworthiness to subjective choiceworthiness
Upon introducing the cardinal subjective choiceworthiness function \(c(\pi )\) above, we placed no restrictions on what it could be. Now that we have documented the emergence of an elaborate web of concepts concerning \(\pi \), however, we can consider how it relates to c.
Recall that k-metachoiceworthiness claims are defined so that, if I know that an act’s k-metachoiceworthiness for me is x, the act’s subjective choiceworthiness for me is x. Let us now introduce a compatible, minimally restrictive principle with which one’s subjective choiceworthiness function might comply in the face of uncertainty about an act’s k-metachoiceworthiness.
Definition 2.9
The Dominance Principle is the principle that
- If \(b \ge x\) \(\forall b \in [\vec {c_k}]_i\), and \(b^{*} > x\) for some \(b^{*} \in [\vec {c_k}]_i\), then \(c(a_i) > x\).
- If \(b \le x\) \(\forall b \in [\vec {c_k}]_i\), and \(b^{*} < x\) for some \(b^{*} \in [\vec {c_k}]_i\), then \(c(a_i) < x\).
Note that if I accept the Dominance Principle, it follows immediately that my subjective choiceworthiness for an act \(a_i\) is well-defined whenever \(|\cap _{k \in {\mathbb {N}}}[\min ([\vec {c_k}]_i), \max ([\vec {c_k}]_i)]|=1\). That is, whenever exactly one number lies in the ranges of “admissible” (not dominated) k-metachoiceworthiness values, across all k, for an act, that number must be the act’s subjective choiceworthiness.
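The intersection condition can be illustrated with a short sketch. It departs from the text in one important respect, which is an assumption of the example: the intersection above runs over all \(k \in {\mathbb {N}}\), whereas the code inspects only finitely many orders, so it can show how the admissible ranges narrow but cannot by itself establish convergence. The function name and the numbers are hypothetical, and exact equality is used only because the example values are exactly representable.

```python
def pinned_value(claims_by_order):
    """Return the unique value lying in every admissible range, if there is one.

    claims_by_order: one list per order k, holding the k-metachoiceworthiness
                     values claimed for a single act by the k-metatheories
                     assigned positive probability
    """
    lo = max(min(claims) for claims in claims_by_order)  # largest lower bound
    hi = min(max(claims) for claims in claims_by_order)  # smallest upper bound
    # The intersection of the ranges is [lo, hi]; it is a single point iff lo == hi.
    return lo if lo == hi else None

# Hypothetical nested ranges for one act at orders k = 1, 2, 3.
narrowing = [[0.0, 4.0], [1.0, 3.0], [2.0, 2.0]]
still_open = [[0.0, 4.0], [1.0, 3.0]]

value = pinned_value(narrowing)      # 2.0
no_value = pinned_value(still_open)  # None
```

In the first case the ranges shrink to the single point 2.0, which the Dominance Principle then fixes as the act's subjective choiceworthiness. In the second case the values between 1.0 and 3.0 all remain admissible, so the Principle alone does not settle the matter.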
Note also that any claim about subjective choiceworthiness itself, such as the Dominance Principle, in some sense takes on both a positive and a normative interpretation. One could interpret the Principle normatively as asserting that one’s subjective choiceworthiness always ought to obey the above pattern. In this case, if one accepts the Principle, one’s subjective choiceworthiness also does obey it, since to hold that an act should be ranked highly for someone in your epistemic position is simply another way to say that it is highly subjectively choiceworthy. Alternatively, one could interpret the Principle positively as asserting that, as a matter of fact, subjective choiceworthiness always obeys the above pattern. If one accepts this claim (and that “ought implies can”), one must also accept that subjective choiceworthiness always ought to obey the above pattern. Either way, if one accepts the Principle, one cannot assign positive probability to k-metatheories that claim that the k-metachoiceworthiness of an act lies outside the admissible range imposed by one’s \(k^{\prime }\)-metachoiceworthiness distribution for the act for lower orders \(k^{\prime } < k\).
Finally, note that the framework outlined here differs from other approaches to subjective choiceworthiness in the following respect. Some other approaches [e.g. that of MacAskill (2016a)] begin with the normative theories in all their diversity; work through problems of intertheoretic comparability; and then try to define subjective choiceworthiness with no more structure than necessary. On some accounts, this minimal structure allows only for a binary classification of acts into the “permissible” and the “impermissible” [as recommended, for instance, by Barry and Tomlin (2016)]. The above approach, by contrast, begins by assuming that subjective choiceworthiness is a single cardinal scale, and it characterizes k-metachoiceworthiness claims, and the k-metatheories that make them, in terms of the subjective choiceworthinesses that they would induce if they were known. This approach has the cost of assuming cardinal subjective choiceworthiness, but it has the benefit of immediately giving all my k-metachoiceworthiness claims both unit and level comparability, without requiring any further assumptions.
Thus, from a cardinal definition of subjective choiceworthiness, we also get a cardinal definition of utility, without having to assume it explicitly. By similar reasoning, we also get cardinal definitions of k-metachoiceworthiness for all k. Note that we are not taking the Von Neumann–Morgenstern approach of defining my utility function so that it represents the choices I would make if I were maximizing expected utility; indeed, our project is to explore how far I can stray from certainty about expected utility theory while still knowing how I subjectively ought to act.