1 Introduction

The veil of ignorance is an important theoretical construction for moral and political philosophy. For Rawls, just arrangements of social institutions and general principles that govern them are those that members of a society would choose under a hypothetical state of ignorance about the citizens’ individual social and economic standing, their abilities, etc. (1971). The idea is that such ignorance insulates choice from bias and unfair tailoring of policies to special interests. Similarly, in making his case for utilitarianism, Harsanyi assumes that an agent choosing between social systems without knowing his or her position in them is making a moral value judgment. On the question of the just distribution of income, he writes, “a value judgment on the distribution of income would show the required impersonality to the highest degree if the person who made this judgment had to choose a particular income distribution in complete ignorance of what his own relative position (and the position of those near to his heart) would be within the system chosen” (1953, pp. 434–435, emphasis mine). “This,” Harsanyi reflects later, “is my own version of the concept of the ‘original position’” (1975, p. 598).

Prior to either Harsanyi’s or Rawls’s invocation of some sort of original position, Abba Lerner appealed to something of a conceptual forerunner of the veil of ignorance in the setting of an interesting theorem concerning distributive justice (1946).Footnote 1 Suppose that there is a fixed sum of money to be distributed to individuals in a population. Importantly, Lerner stipulates, the distribution must be chosen under ignorance as to who has which utility function. Given Lerner’s particular formulation of ignorance and assumptions about individual and social welfare, the equal distribution maximizes expected social welfare. If the goal is to maximize expected social welfare, then Lerner provides an argument for an egalitarian distribution of income.

Ignorance in the intended setting, according to Lerner, amounts to it being equally likely, for any two people in the population, that one person possesses a given utility function as that the other possesses it. Ignorance, in other words, is represented as all matchings of utility functions to individuals being equally probable. Harsanyi conceives of ignorance along similar lines. Continuing the passage quoted earlier, Harsanyi claims that an agent would choose an income distribution “in complete ignorance” of his own relative position in society “if he had exactly the same chance of obtaining the first position (corresponding to the highest income) or the second or the third, etc., up to the last position (corresponding to the lowest income) available within that scheme” (1953, p. 435, emphasis mine). This way of thinking about ignorance has been regarded with a good deal of skepticism in general—being in essence an application of the principle of indifference—and, as McManus et al. note, in the context of Lerner’s theorem in particular: “Of particular interest in the past has been Lerner’s assumption of complete ignorance about which individual has which utility function, and his jump from that assumption to the hypothesis that each possible matching is equally likely” (1972, p. 494). About the relevant probabilities, Graaff, for example, writes, “There is no justification for assuming them equal. From absolute ignorance we can derive nothing but absolute ignorance” (1967, p. 100 fn). Similarly, Little, remarking on Lerner’s theorem, claims, “From complete ignorance nothing but complete ignorance can follow” (2002, p. 59).

How should we model ignorance in Lerner’s intended setting? Here, I consider Lerner’s theorem in the setting of imprecise probabilities (IP). By the lights of many, IP allows for a more general and compelling representation of uncertainty and of “complete ignorance” in particular than standard probability models do (e.g., Keynes 1921; Good 1952; Levi 1974; Walley 1991; Kaplan 1996; Joyce 2011; Hájek and Smithson 2012; Weatherson 2015; Stewart and Ojea Quintana 2018; Hill 2019). So it is interesting and natural, it would seem, to consider arguments that appeal to some version of the veil of ignorance or related ideas—as in Lerner’s theorem—in this more general setting. This allows us to perform a robustness check of sorts. Lerner’s particular representation of ignorance, the equi-probability assumption, constitutes a sticking point in the appreciation of his theorem, as we have just seen. To what extent does his conclusion depend on this assumption? Some robustness analysis has been performed, with theorems that establish the equal distribution of income as optimal according to the maximin rule for decisions under ignorance (Theorem 2). But this rule is extremely conservative, moves decidedly away from the maximization of expected utility, and is susceptible to certain intuitive counterexamples (e.g., Luce and Raiffa 1957, pp. 279–280).

After rehearsing the basics of the framework and stating a general version of Lerner’s theorem due to Sen in Sect. 2, I introduce and motivate IP as a framework for representing and reasoning about uncertainty in Sects. 3 and 4. The primary conceptual move is to employ IP theory to model ignorance in the context of ethical arguments that exploit ignorance, e.g., veil of ignorance arguments or, in the case at hand, Lerner’s theorem.Footnote 2 A crucial issue here is that there are several candidate decision criteria for IP, each of which can claim to generalize expected utility maximization. As a result, the admissibility of the egalitarian distribution is a more subtle issue than in the standard setting. In the IP setting, we find that, from the assumption of total ignorance, we do not automatically arrive at egalitarianism, partially vindicating those skeptical of Lerner’s “jump” from complete ignorance to equal probability. It depends, in part, on the IP generalization of expected utility that we adopt. But, pace Little and Graaff, it is not true that we arrive at nothing from total ignorance. Even for the most extreme form of ignorance in the IP setting, what we arrive at depends, again, on the decision rule that we adopt. The admissibility of the equal distribution is fairly robust for IP decision rules under complete ignorance (Observation 1). Moreover, certain properties of social welfare that admit some ethical motivation entail the unique admissibility of the equal distribution under complete ignorance for some decision criteria (Observation 2). On the other hand, certain intuitively very unjust distributions are likewise admissible for some of the decision rules considered (Example 1). To the extent one finds the relevant IP decision theory compelling, this calls into question the inference from a distribution’s choiceworthiness under ignorance to its status as fair or just.

2 (Sen’s version of) Lerner’s theorem

In presenting Lerner’s theorem, I follow Sen’s more general version (1973). Sen’s version of the theorem generalizes away from the assumption that social welfare is additive or even separable, which already responds to certain objections to Lerner’s original formulation (Friedman 1947). We assume that there is a group N of individuals, \(i = 1, \dots , n\). We also assume that there is a collection of n utility functions, \(U^1, \dots , U^n\).Footnote 3 It is unknown—and this is the crucial point—which utility function is associated with which individual. Let y be any income vector \((y_1, \dots , y_n)\), with \(y_i\) denoting the income of individual i. Let z be a vector of equal incomes, \(z_i = z_j\) for all \(i, j \in N\). A function of n arguments is called symmetric if the value of the function is constant under permutations of the arguments. For real vectors, we say \(y = (y_1, \dots , y_n) \ge y' = (y'_1, \dots , y'_n)\) if \(y_i \ge y'_i\) for all \(i \in N\). A function f is increasing if \(y \ge y'\) implies \(f(y) \ge f(y')\). A function \(f: \mathcal {S} \rightarrow \mathbb {R}\) defined on a convex subset of a real vector space is called concave if for all \(x, y \in \mathcal {S}\) and all \(\lambda \in [0, 1]\), \(f(\lambda x + (1 - \lambda ) y) \ge \lambda f(x) + (1 - \lambda ) f(y)\). Standardly, concavity of individual utility is taken to reflect diminishing marginal utility of income and, in the context of decisions under uncertainty, risk aversion. Concavity of a social welfare function, on the other hand, reflects a form of inequality aversion. The assumptions for Sen’s generalization of Lerner’s theorem are as follows.

(A.1) (Total Income Fixity). There is a fixed sum \(y^*\) to be distributed: \(\sum _{i = 1}^n y_i = y^*\).

(A.2) (Concavity of the Group Welfare Function). Social welfare W, an increasing and symmetric function of individual utilities, is concave.

(A.3) (Concavity of Individual Welfare Functions). Individual welfare functions are concave.

(A.4) (Equi-probability). For each j, \(p^j_i = p^j_m\) for all \(i, m \in N\).

Here, \(p_i^j\) is the probability that agent i has utility function \(U^j\). One way to think about assumption A.4 is that a social planner tasked with deciding the distribution of income uses the uniform probability distribution over the possible matchings of utility functions to individuals.

For any income distribution y, let \(\tilde{y}\) be a permutation of y such that \(\tilde{y}^j\) is the income of the individual with the \(j^{th}\) utility function. There are n! ways of assigning n utility functions to n individuals. For each such assignment k, there is a particular permutation vector \(\tilde{y}(k)\) reflecting a particular assignment of utility functions to individuals (or, maybe more carefully, a matching of incomes with utility functions). So for any income vector y, the n! possible social welfare values are given by \(F(\tilde{y}(k))\), \(k = 1, \dots , n!\), where F is the reduced or compound function \(F(\tilde{y}(k)^1, \dots , \tilde{y}(k)^n) = W(U^1(\tilde{y}(k)^1), \dots , U^n(\tilde{y}(k)^n))\) for all y and all k (exploiting the symmetry of W to order profiles by the utility function index). A social planner facing the income distribution problem would know, for any given assignment k, that \(\tilde{y}^j\) is the income going to the person with utility function \(U^j\), but not know which individual in the population possesses that utility function. In general, the expectation of social welfare is

$$\begin{aligned} E_p(y) = \sum _{k = 1}^{n!} F(\tilde{y}(k))p(k) \end{aligned}$$
(1)

for some probability p. Under assumption A.4, expected social welfare is given by

$$\begin{aligned} E(y) = \frac{1}{n!} \sum _{k = 1}^{n!} F(\tilde{y}(k)). \end{aligned}$$
(2)
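To fix ideas, here is a minimal computational sketch of Eq. (2) for a hypothetical three-person case. The particular utility functions, welfare function, and income vectors are illustrative assumptions of mine, not part of Lerner's or Sen's setup.

```python
from itertools import permutations
from math import factorial, sqrt

# Hypothetical concave utility functions U^1, U^2, U^3 (illustration only).
U = [lambda x: x, lambda x: sqrt(x), lambda x: x ** 0.8]

def W(utilities):
    # Hypothetical social welfare: increasing, symmetric, and concave (here, additive).
    return sum(utilities)

def F(y_tilde):
    # Reduced function: y_tilde[j] is the income of the person with utility function U^j.
    return W([U[j](y_tilde[j]) for j in range(len(U))])

def expected_welfare_uniform(y):
    # Eq. (2): average F over all n! matchings of incomes to utility functions.
    n = len(y)
    return sum(F(assignment) for assignment in permutations(y)) / factorial(n)

y = (60.0, 30.0, 10.0)   # an unequal distribution of y* = 100
z = (100.0 / 3,) * 3     # the equal distribution

print(expected_welfare_uniform(y))  # approx 54.8
print(expected_welfare_uniform(z))  # approx 55.6, higher, as Theorem 1 predicts
```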

We can now state (Sen’s version of) Lerner’s theorem.

Theorem 1

Given (A.1), (A.2), (A.3), and (A.4), expected social welfare is maximized by z, the equal distribution of income.
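The core of the argument can be sketched in one line (this is my reconstruction of the standard Jensen-style reasoning, not a substitute for Sen's proof). Since each \(U^j\) is concave and W is increasing, symmetric, and concave, the reduced function F is concave, so

$$\begin{aligned} E(y) = \frac{1}{n!} \sum _{k = 1}^{n!} F(\tilde{y}(k)) \le F\left( \frac{1}{n!} \sum _{k = 1}^{n!} \tilde{y}(k)\right) = F(z) = E(z), \end{aligned}$$

where the middle equality holds because averaging an income vector over all of its permutations yields the equal vector \((y^*/n, \dots , y^*/n) = z\), and the last equality holds because every permutation of z is z itself.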

As mentioned above, many find A.4 to be unmotivated as a representation of ignorance about who has which utility function. One attempt to respond to these concerns is to eschew equating ignorance with any probabilistic judgment whatsoever and to appeal to some decision theoretic alternative to expected utility theory. In particular, versions of a theorem to the effect that the equal distribution is a maximin strategy have been proved, demonstrating that Lerner’s conclusion is not as dependent on A.4 as it might initially appear (Sen 1969; McManus et al. 1972; Sen 1973). The maximin policy is to maximize the minimum possible level of social welfare, where the possibilities are the assignments of utility functions to individuals. Dispensing with A.4, Sen makes two further assumptions (1973, Theorem 2).Footnote 4

(A.5) (Shared Set of Welfare Functions). For any individual i and any utility function j, it is possible that i has j.

(A.6) (Bounded Individual Utility Functions). Each function \(U^j\) is bounded from below.

Theorem 2

Given (A.1), (A.2), (A.3), (A.5), and (A.6), the equal distribution z is the maximin strategy for social welfare.

As McManus et al. put it, “the worrisome hypothesis of equal probability is not necessary for Lerner’s conclusion” (1972, p. 494).

Table 1 A problem with maximin

However, maximin is itself of questionable normative status. For example, that \(o_2\) and not \(o_1\) is the maximin solution in the decision problem in Table 1 gives some pause about the general appeal of the rule for decisions under uncertainty (Cf. Luce and Raiffa 1957, pp. 279–280). Option \(o_1\)’s payoff from state 2 on can be increased arbitrarily and option \(o_2\)’s payoff can be decreased to any positive number (provided \(o_1\)’s payoff in \(s_1\) is still lower) without altering maximin’s verdict. Notice, however, that the force of this objection depends heavily on measurability assumptions about utilities. Still, with no probabilistic assessments of the states whatsoever, it is, at the very least, extremely difficult to evaluate such choices.
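The structure of this objection can be made concrete with a small sketch. The payoff numbers below are hypothetical, chosen merely in the spirit of the Luce and Raiffa example; they are not the entries of Table 1.

```python
# Hypothetical payoffs: rows are options, columns are states s_1 and s_2.
payoffs = {
    "o1": [0, 1_000_000],  # slightly worse in s_1, vastly better in s_2
    "o2": [1, 1],          # barely better in s_1, mediocre everywhere
}

def maximin(options):
    """Return the option(s) whose worst-case payoff is highest."""
    worst = {name: min(row) for name, row in options.items()}
    best_worst = max(worst.values())
    return [name for name, w in worst.items() if w == best_worst]

print(maximin(payoffs))  # ['o2'], no matter how large o1's payoff in s_2 is made
```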

3 Complete ignorance

The theory of imprecise probabilities can represent a form of ignorance more severe than any that can be represented on the assumption that probability judgments are numerically precise. There are a number of IP frameworks, but I will work with arbitrary sets of probability functions. In this framework, an agent’s credal state is represented by a set \(\mathbb {P}\) of probability functions rather than a single probability function. In his review of Walley’s treatise on IP, Wasserman takes note of this “nice feature”: “A nice feature about [IP] is that there exists a transformation-invariant expression of ignorance, a holy grail in Bayesian statistics. To represent ignorance, we use the set \(\mathcal {P}\) of all probabilities” (1993, p. 701).Footnote 5 Many advocates of IP have found the set of all probabilities to be an eminently natural representation of complete ignorance. Isaac Levi, for example, calls such a state “probabilistically ignorant in the extreme sense” (1977). Since no probability distribution is ruled out, \(\mathbb {P} = \mathcal {P}\) reflects no information about the relevant possibilities.

The primary concern about this proposal for representing ignorance has to do with what is sometimes called belief inertia (Levi 1980; Vallinder 2018; Bradley 2019). After his flattering remarks, Wasserman continues, “It is tempting to conclude that this solves the problem of finding an objective prior for Bayesian inference. But a vacuous prior gives a vacuous posterior, no matter how much data we obtain. So we cannot represent ignorance after all, at least not in standard statistical problems [...] What a shame that we cannot drink from the grail” (1993, p. 701). But belief inertia poses no problem for using this representation of ignorance to carry out the exercise of making hypothetical choices behind the veil of ignorance.Footnote 6 In this context, there is no question of learning from evidence or of updating our “initial” ignorance.

Let \(\mathbb {P}^j_i\) be the set of probabilities assigned to individual i having utility function \(U^j\). I propose to consider replacing Lerner’s equi-probability assumption A.4 with the following assumption.

(A.7) For each j, \(\mathbb {P}^j_i = [0,1]\) for all \(i \in N\).

One way to think about assumption A.7 is that a social planner responsible for the choice of income distribution uses the same set of probability distributions over the possible utility functions—the set of all distributions—for each individual. A.7 immediately implies that \(\mathbb {P}^j_i = \mathbb {P}^j_m\) for all \(i, m \in N\), in analogy to the equi-probability assumption A.4. A.7 also implies A.5 since, for any utility function j and individual i, \(\mathbb {P}^j_i\) includes positive values (even 1) for the possibility that j belongs to i, which is not the case when it is not possible that i has j. On the one hand, A.7 is a very weak assumption, too weak, one might think, to derive something like Lerner’s egalitarian conclusion from. On the other hand, A.7 would seem to be far less open to the objections leveled against the equi-probability assumption, A.4, as a representation of ignorance. However, Observations 1 and 2 below show that, together with the other assumptions discussed, interesting content remains.

We have assumed a fixed, finite set of utility functions—in fact, of the same cardinality as the population (but see footnote 3). But there could be greater uncertainty about the appropriate set of possible utility functions. A larger set of possible individual utility functions would imply a larger set of possible assignments of utility functions to individuals. So, we could be confronted with a state of what Levi calls “modal ignorance” that is more severe than we have assumed (1977). (A smaller set of possible assignments of utility functions to individuals is also possible—if the social planner knows that one particular utility function belongs to a particular individual, for example—and would be a less severe state of modal ignorance than we are assuming.) In such a state of greater modal ignorance, we could still face probabilistic ignorance regarding the relative likelihoods of the various assignments. While I will not pursue this issue here, extending Lerner’s theorem to significantly greater states of modal ignorance may provide for an even more severe and convincing robustness check on Lerner’s egalitarian argument.Footnote 7

4 Choice under IP

Part of the debate between Harsanyi and Rawls concerns, not only the form of ignorance faced, but the appropriate decision rule to use behind the veil of ignorance. Where Harsanyi advocates the maximization of expected utility, making certain assumptions of equi-probability, Rawls advocates a non-probabilistic rule, maximin, and denies the legitimacy of informative probabilistic judgments about social standing.Footnote 8 The mere introduction of IP forms of ignorance—which are in a sense probabilistic without necessarily being as precise as those Rawls objects to—does not settle the issue of the appropriate decision criterion with which to pair such representations, even if we retain a general commitment to the spirit of expected utility.

There are a number of candidates considered in the literature, each with a claim to generalize expected utility maximization. Since this is a matter of some controversy, and the subsequent analysis depends on it, let’s briefly review the primary proposals. Let Y be a given set of options (for simplicity, I suppress reference to a state space here). One of the more restrictive IP decision rules embodies some of the pessimism of maximin (e.g., Gilboa and Schmeidler 1989). \(\Gamma \)-maximin restricts choice to the set

$$\begin{aligned} \bigg \{x \in Y: \inf _{p \in \mathbb {P}} EU_p(x) \ge \inf _{p \in \mathbb {P}} EU_p(y)\ \text {for all}\ y \in Y\bigg \}, \end{aligned}$$

where \(EU_p(x)\) is the p-expected U-utility of x. (In the context of Lerner’s theorem, we drop the social welfare function W from our notation and write \(E_p(x)\) as in Eq. 1.) Another of the more restrictive IP decision rules is optimistic where \(\Gamma \)-maximin is pessimistic. The \(\Gamma \)-maximax options of Y are given by the set

$$\begin{aligned} \bigg \{x \in Y: \sup _{p \in \mathbb {P}} EU_p(x) \ge \sup _{p \in \mathbb {P}} EU_p(y)\ \text {for all}\ y \in Y\bigg \}. \end{aligned}$$

E-admissibility, a more liberal rule, was proposed by Isaac Levi (1974). E-admissible options maximize expected utility relative to some \(p \in \mathbb {P}\). That is, the E-admissible options are those in the set

$$\begin{aligned} \Big \{x \in Y: \exists p \in \mathbb {P}\ \forall y \in Y\ EU_p(x) \ge EU_p(y)\Big \}. \end{aligned}$$

More liberal still, Maximality (Walley 1991) enjoins us to choose options undominated in expectation, restricting choice to the set

$$\begin{aligned} \Big \{x \in Y: \forall y \in Y\ \exists p \in \mathbb {P}\ EU_p(x) \ge EU_p(y)\Big \}. \end{aligned}$$

Interval Dominance appeals to a weaker notion of avoiding dominated options. Admissible options according to this rule are those in the set

$$\begin{aligned} \bigg \{x \in Y: \sup _{p \in \mathbb {P}} EU_p(x) \ge \inf _{p \in \mathbb {P}} EU_p(y)\ \text {for all}\ y \in Y\bigg \}. \end{aligned}$$

\(\Gamma \)-maximin, \(\Gamma \)-maximax, E-admissibility, Maximality, and Interval Dominance all generalize expected utility maximization. When \(\mathbb {P} = \{p\}\), all rules amount to maximizing expected utility with respect to p. For other discussions of this rule set, see for example (Troffaes 2007; Chandler 2014; Huntley et al. 2014). Troffaes establishes that these decision rules stand in certain logical relations to each other generally, as depicted in Fig. 1 (2007, Theorem 1). An arrow represents the fact that admissibility according to one rule implies admissibility according to the rule the arrow points to.

Fig. 1 Relations among IP Decision Rules
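To make the behavior of these criteria concrete, the following sketch evaluates each rule over a credal set represented by a finite list of probability vectors over the states. The finiteness is a simplifying assumption of mine (E-admissibility and Maximality are, strictly speaking, quantified over the entire convex set \(\mathbb {P}\)), and the function names are illustrative, not standard.

```python
def expectations(utils, credal_set):
    # Expected utility of an option (utility listed per state) under each p in the credal set.
    return [sum(p_s * u_s for p_s, u_s in zip(p, utils)) for p in credal_set]

def gamma_maximin(options, credal_set):
    lower = {o: min(expectations(u, credal_set)) for o, u in options.items()}
    return {o for o, v in lower.items() if v >= max(lower.values())}

def gamma_maximax(options, credal_set):
    upper = {o: max(expectations(u, credal_set)) for o, u in options.items()}
    return {o for o, v in upper.items() if v >= max(upper.values())}

def e_admissible(options, credal_set):
    # Approximation: an option counts as E-admissible if it maximizes expected
    # utility under at least one of the listed distributions.
    admissible = set()
    for p in credal_set:
        exp = {o: expectations(u, [p])[0] for o, u in options.items()}
        admissible |= {o for o, v in exp.items() if v >= max(exp.values())}
    return admissible

def maximal(options, credal_set):
    # Undominated: no rival is strictly better in expectation under every listed p.
    def dominates(y, x):
        ex = expectations(options[x], credal_set)
        ey = expectations(options[y], credal_set)
        return all(b > a for a, b in zip(ex, ey))
    return {x for x in options if not any(dominates(y, x) for y in options if y != x)}

def interval_dominant(options, credal_set):
    lower = {o: min(expectations(u, credal_set)) for o, u in options.items()}
    upper = {o: max(expectations(u, credal_set)) for o, u in options.items()}
    return {o for o in options if upper[o] >= max(lower.values())}

def e_plus_gamma_maximin(options, credal_set):
    # Lexicographic rule discussed below: Gamma-maximin applied to the E-admissible options.
    pool = {o: options[o] for o in e_admissible(options, credal_set)}
    return gamma_maximin(pool, credal_set)
```

When the credal set is a singleton, each of these functions returns the set of expected utility maximizers, in keeping with the fact that all of the rules generalize expected utility maximization.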

Since E-admissibility is a rather permissive decision rule—Maximality and Interval Dominance even more so—some have considered lexicographic choice procedures that would narrow the class of admissible options further. One candidate often considered in this context is E-admissibility \(+\) \(\Gamma \)-maximin (abbreviated \(E + \Gamma \) in Fig. 1) (Levi 1986; Seidenfeld et al. 2012). This rule first restricts choice to E-admissible options, and then applies \(\Gamma \)-maximin to the set of E-admissible options as a tie-breaking procedure. What are the logical relations between this rule and those considered above? While the admissibility of an option according to E-admissibility \(+\) \(\Gamma \)-maximin clearly implies its E-admissibility, it is not true that it implies the option’s admissibility according to \(\Gamma \)-maximin (e.g., Seidenfeld 2004, Example 1). The following example demonstrates, among other things, that admissibility according to \(\Gamma \)-maximax does not imply admissibility according to E-admissibility \(+\) \(\Gamma \)-maximin.

Example 1

Let \(N = \{1, 2\}\) and \(y^* = \$ 100\). Let \(U^1(x) = x\) and \(U^2(x) = \sqrt{x}\). Both of these utility functions are concave (on their domains). Define \(W(u^1, u^2) = u^1 + u^2\), where \(u^j\) is the utility of the person with the \(j^{th}\) utility function. That W is increasing, symmetric, and concave is easily verified. For each j, let \(\mathbb {P}^j_i = [0, 1]\) for all \(i \in N\). Consider the distribution \(y = (99.75, 0.25)\). For some \(p \in \mathbb {P}\), \(E_p(y) \approx 10.24\), and for another \(p' \in \mathbb {P}\), \(E_{p'}(y) = 100.25\). For the equal distribution z, \(E_p(z) \approx 57.07\) for all \(p \in \mathbb {P}\). \(\triangle \)
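The figures in Example 1, and the verdicts of the rules just reviewed, can be checked directly. The snippet below is a sketch that represents the vacuous credal set by its two extreme points (the point masses on the two assignments), which suffices here because expectations are linear in p.

```python
from math import sqrt

def U1(x):        # utility function U^1
    return x

def U2(x):        # utility function U^2
    return sqrt(x)

def W(u1, u2):    # increasing, symmetric, concave social welfare
    return u1 + u2

def assignment_welfares(income):
    # Social welfare of an income vector (income of person 1, income of person 2)
    # under the two possible matchings of utility functions to individuals.
    a, b = income
    return [W(U1(a), U2(b)),   # person 1 has U^1, person 2 has U^2
            W(U1(b), U2(a))]   # person 1 has U^2, person 2 has U^1

y = (99.75, 0.25)
z = (50.0, 50.0)

print(assignment_welfares(y))  # approx [100.25, 10.24]
print(assignment_welfares(z))  # approx [57.07, 57.07]

# Gamma-maximin favours z (57.07 > 10.24), while Gamma-maximax favours y (100.25 > 57.07).
# Each option maximizes expectation under one of the two point-mass priors, so both are
# E-admissible, and hence also Maximal and Interval Dominant.
```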

Two comments on Example 1 are in order. First, distribution y guarantees that z is neither a \(\Gamma \)-maximax option nor uniquely E-admissible.Footnote 9 The point about \(\Gamma \)-maximax suffices to establish the last claim of Observation 1 below. The point about the E-admissibility of y motivates the search for suitable assumptions that allow us to be more discriminating. Some such assumptions figure into Observation 2 below. Second, as mentioned just above, while y is a \(\Gamma \)-maximax option, it is not admissible according to E-admissibility \(+\) \(\Gamma \)-maximin since z will eliminate it at the second (\(\Gamma \)-maximin) stage. This helps us to further locate E-admissibility \(+\) \(\Gamma \)-maximin with respect to its logical relations to the other decision criteria.

We can now state our first observation.

Observation 1

Given (A.1), (A.2), (A.3), (A.6), and (A.7), the equal distribution z is admissible according to

1. \(\Gamma \)-maximin
2. E-admissibility
3. E-admissibility \(+\) \(\Gamma \)-maximin
4. Maximality
5. Interval Dominance.

However, z is not generally a \(\Gamma \)-maximax distribution.

Condition A.2 assumes that social welfare is concave. This is unnecessarily strong for some of the subclaims of Observation 1. For the claim about \(\Gamma \)-maximin, for instance, it suffices to assume that social welfare is quasi-concave.Footnote 10 But I would like to consider strengthening concavity or quasi-concavity to their strict versions for two reasons. First, such a strengthening has been thought to have important ethical ramifications for social welfare precisely because of its egalitarian nature. So it is already of theoretical interest here. Second, one might regard Observation 1 as less than a fully persuasive rational case for equal distribution. That z is admissible does not mean that it must ultimately be selected. Many other, decidedly non-egalitarian distributions may likewise be admissible. This is precisely the case in Example 1, where both the equal distribution and the lopsided (99.75, 0.25) distribution are E-admissible. As a result, the ethical upshot of Observation 1 is at most that z is permissible, not that it is ethically mandatory. As we will see, the strengthening has further interesting consequences in this respect.

Let’s begin with the ethical motivation for strict concavity/quasi-concavity. The following famous example is due to Diamond (1967), and was originally intended as a criticism of Harsanyi’s utilitarianism.

Example 2

Let \(N = \{1, 2\}\) and suppose that we have to allocate a single unit of an indivisible good. Consider the following two policies. The first policy allocates the good to individual 1 for sure. The second policy is to randomize between the allocations (1, 0) and (0, 1), each with equal probability of being selected. Suppose that \(U^i(1) = 1, U^i(0)=0\). If the interests of individuals 1 and 2 are given equal weight, then society will be indifferent between the two policies if the group welfare function is linear in individual utilities.Footnote 11 In defense of social indifference, one might say that under either policy, a distribution results allocating the good to just one individual. In criticism of social indifference, one might say that the second policy gives individual 2 a “fair shake” while the first policy does not (1967, p. 766). The crucial distinction here is about concern for outcomes alone versus concern for procedure. \(\triangle \)

On the basis of the sort of fairness consideration displayed by the second reaction to Diamond’s example—which might be viewed as a concern for equality of opportunity in one sense—Epstein and Segal construct a theory of social choice that relaxes expected utility at the social level and imposes a strict preference for randomization in scenarios like the one in Example 2 (1992). Their social preferences are strictly quasi-concave. (In particular, their theory entails that the \(.50-.50\) randomization is preferred to randomizing with any other probabilities in Example 2.)Footnote 12 We must note again that the force of Example 2 depends on both measurability and (strong) comparability assumptions about utilities.
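Read ex ante, with W applied to the individuals' expected utilities (a simplifying gloss on Epstein and Segal's framework), the strict preference for randomization in Example 2 follows immediately from symmetry and strict quasi-concavity:

$$\begin{aligned} W\left( \tfrac{1}{2}, \tfrac{1}{2}\right) = W\left( \tfrac{1}{2}(1,0) + \tfrac{1}{2}(0,1)\right) > \min \{W(1,0), W(0,1)\} = W(1,0), \end{aligned}$$

so the even randomization is strictly socially preferred to allocating the good to individual 1 for sure.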

In our setting, concavity and quasi-concavity reflect a social aversion to inequality in the distribution of utilities. Their strict versions reflect a stronger form of such aversion and rule out more options as inadmissible for social choice, as the following observation attests. However, most of the IP rules under consideration fail to secure the unique admissibility of the equal distribution in general—that is, for many sets of utility functions about which a social planner could be uncertain, z will not be uniquely admissible according to most of these rules.

Observation 2

Given (A.1), (A.3), (A.6), (A.7) and modifying (A.2) so that social welfare is strictly quasi-concave (resp. strictly concave), the equal distribution z is uniquely admissible according to \(\Gamma \)-maximin (resp. E-admissibility \(+\) \(\Gamma \)-maximin). However, z generally fails to be uniquely admissible according to \(\Gamma \)-maximax, E-admissibility, Maximality, and Interval Dominance.
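As a small numerical illustration of the \(\Gamma \)-maximin part of Observation 2, the following sketch replaces the additive welfare function of Example 1 with a hypothetical strictly concave (hence strictly quasi-concave) alternative and searches a grid of two-person distributions of $100. The specific functions are illustrative choices of mine, not part of the observation itself.

```python
from math import sqrt

def U1(x):
    return x

def U2(x):
    return sqrt(x)

def W(u1, u2):
    # Hypothetical strictly concave, increasing, symmetric welfare function.
    return sqrt(u1) + sqrt(u2)

def worst_case_welfare(income):
    # Lower expectation of social welfare under the vacuous credal set:
    # the minimum over the two matchings of utility functions to persons.
    a, b = income
    return min(W(U1(a), U2(b)), W(U1(b), U2(a)))

grid = [(x, 100.0 - x) for x in range(0, 101)]
best = max(grid, key=worst_case_welfare)
print(best)  # (50, 50.0): on this grid, only the equal split survives Gamma-maximin
```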

We see clearly here how, in general, the case for Lerner’s egalitarianism depends delicately on the IP generalization of expected utility that we adopt. One strategy, not pursued here, is to find ways to discriminate among the various IP decision criteria in terms of normative attractiveness. Seidenfeld (2004) and Troffaes (2007), for example, both discuss this possibility, each finding an advantage for E-admissibility. Levi (1980) defends the E-admissibility of an option as a necessary condition for its rational selection, but thinks further tie-breaking criteria like a lexicographic application of \(\Gamma \)-maximin as in the rule E-admissibility \(+\) \(\Gamma \)-maximin are matters of individual discretion.

5 Discussion

Many ethical and political theories make crucial appeals to epistemic states of uncertainty and ignorance. Appropriate representation of such states, then, is an important issue in these contexts. This essay is a case study in applying the theory of imprecise probabilities in the general area of veil-of-ignorance-type arguments. Taking for granted the philosophical premise of the relevance of ignorance for evaluating issues impartially or fairly, we can perform a conceptual robustness check on Lerner’s argument for the egalitarian income distribution by considering other, more plausible representations of “complete ignorance.” There are two broad types of reactions one might have to the foregoing analysis.

First, one might take the present study as good news for egalitarianism. Observation 1 shows that Lerner’s conclusion regarding the optimality of the egalitarian income distribution is, in a particular sense, robust across certain important representations of ignorance. We need not assume “the worrisome hypothesis of equal probability” to reach the admissibility of the equal distribution of income, nor need we move to totally non-probabilistic decision rules and representations of ignorance, as Theorem 2 does. Even the extremely weak A.7 suffices for the admissibility of the equal distribution for most rules. However, for some IP choice rules, admissibility is quite a weak notion. In general, it is much weaker than optimality according to a weak order, for example. It is worth noting, as Observation 2 does, that the unique admissibility of the equal distribution can be secured without moving away from the assumption of complete ignorance. This conclusion now depends both on ethical assumptions about social welfare and on the IP decision rule that we adopt. But perhaps the ethical assumptions and decision rules that secure uniqueness can be motivated.Footnote 13

Second, one might take the present study as bad news for Lerner’s argument. While Observation 1 shows that the equal distribution is admissible, it is not uniquely admissible in general. In Example 1, for instance, severely skewed income distributions are also admissible according to several of the generalizations of expected utility under consideration, and for other sets of utility functions, completely skewed distributions will be admissible. Pre-theoretic, fairness-based reasons to reject a completely skewed income distribution may militate against an analysis of fairness in terms of admissibility under ignorance. That is, such observations may well incline some to reject the general decision-making under IP-ignorance approach to evaluating issues of distributive justice—or even the more general philosophical premise mentioned at the beginning of this section that gives ignorance a distinguished role in evaluating issues of impartiality and fairness. And while Observation 2 demonstrates that the unique admissibility of the equal distribution can be secured under certain assumptions about social welfare for certain IP decision rules, for most of the rules under consideration, the equal distribution is not uniquely admissible even relative to the stronger egalitarian assumptions about social welfare made in the observation. On such views, the vindication of Lerner’s argument awaits a defense of some set of assumptions that implies that the equal distribution is uniquely admissible.

This second sort of reaction, in objecting to the admissibility of completely and severely skewed distributions, makes appeal to a mild procedure-independent notion of fairness. By contrast, pure procedural justice makes no such appeal, defining any distribution resulting from a fair procedure as just—like any holdings resulting from a fair gamble. Still other notions of procedural justice do appeal to independent standards of fairness (e.g., Rawls 1971, §14). If the IP statement of Lerner’s problem is the more appropriate or compelling representation of choice under complete ignorance, one consequence of the critique voiced in the second reaction would seem to be either that unfair distributions can result from this type of fair procedure—compare well-conducted legal trials that reach a wrong verdict—or that choice under ignorance is not a fair procedure.Footnote 14