1 Introduction

In the past few decades, increased use of automated decision-making in general and AI in particular has drawn attention to many philosophical problems, such as the implications of opaque ‘black-box’ like decisions (see, e.g. Fleischmann & Wallace, 2005; Castelvecchi, 2016; Holm, 2019; Zerilli et al., 2019), the possibility of a responsibility gap when machines make decisions autonomously (see, e.g. Matthias, 2004; Johnson, 2015; de Laat, 2018), and the risks of AI—ranging from the mundane to the existential (see, e.g. Müller & Bostrom, 2014; Ord, 2020; Müller & Cannon, 2022).

Among the mundane—but far from unimportant—risks of automated decision-making is unfairness. Over and over again, examples of biased, prejudiced, and discriminatory behaviors have turned up in automated systems. For example, the literature review by Köchling & Wehner (2020) shows that many recruitment systems exhibit bias with respect to gender and ethnicity. Obermeyer et al. (2019) show that (US) systems for prediction of medical needs exhibit large racial biases, since the algorithm uses health care costs as a proxy for illness, but the costs in the training data do not accurately reflect the actual medical needs. Similarly, a literature review by Cavazos et al. (2020) shows that nearly all face recognition algorithms tested are racially biased in their performance. Interestingly, some of these algorithms are better at recognizing faces from their own geographic origins, illustrating the point made by Mittelstadt et al. (2016) that algorithms reflect the values of their designers. Koenecke et al. (2020) find similar results in their analysis of speech recognition systems, and Hankerson et al. (2016) catalog many other similar cases, including a soap dispenser which would dispense soap onto light-skinned but not dark-skinned hands.

As a result, technical AI experts and philosophers alike have devoted considerable effort to understanding and mitigating these problems. Interestingly, the more automated decision-support systems are used, and the greater their level of automation, the more can be gained—in terms of better outcomes—from improving such automated procedures (see, e.g., Lee et al., 2019). It is this intimate connection between procedure and outcome which makes the field known as algorithmic fairness concerned with both procedural justice (the process of developing and deploying decision support systems must not be biased against some groups or individuals) and substantive justice (the outcomes of decisions made by or with the help of those systems must not unjustly disadvantage some groups or individuals). A technically oriented review article of the field is Chouldechova & Roth (2020); a more philosophically oriented one is Fazelpour & Danks (2021).

One strand of thought which has received much attention is the prospect of Rawlsian algorithmic fairness, i.e., the application of Rawls’s seminal 1971 work A Theory of Justice to these questions.Footnote 1 Procaccia (2019) calls Rawls “AI’s favorite philosopher”, and it has been argued that the goal of getting rid of bias or discrimination “is rooted in Rawlsian ethics” (Procaccia, 2020). This Rawlsian approach to algorithmic fairness is further introduced in Section 2. But while popular, it has been criticized for allowing “loopholes” (Jørgensen & Søgaard, 2023). In previous work (Franke, 2021) we identified a number of complications with Rawlsian algorithmic fairness, but were unable to provide any unified explanation for why these complications occur. The purpose of this article is to offer such an explanation. We restrict ourselves to the assumption that Rawls is broadly right and ask: What does this mean for algorithmic fairness? How should Rawlsian thought be applied in this area? In particular, we focus on the difference principle.

More precisely, we identify what seems to be a root cause of many of the complications identified: Proposals to achieve Rawlsian algorithmic fairness in the literature (see, e.g., Heidari et al., 2018) often aim to uphold the difference principle in the particular decision-situations where automated decision-making occurs. However, the Rawlsian difference principle applies to society at large—an aggregation of many such situations—and as argued in Section 3, the difference principle does not aggregate in such a way that upholding it in constituent situations also upholds it in the aggregate. But such aggregation is a hidden premise of many proposals for Rawlsian algorithmic fairness in the literature. Having made this point about the missing aggregation property, we briefly discuss in Section 4 how the difference principle could instead be upheld. Finally, Section 5 offers some concluding remarks.

2 Algorithmic Fairness and the Rawlsian Approach

The Rawlsian approach to algorithmic fairness is based on Rawls’s two principles of justice for institutions (Rawls, 1999, p. 266):

  • First principle: Each person is to have an equal right to the most extensive total system of equal basic liberties compatible with a similar system of liberty for all.

  • Second principle: Social and economic inequalities are to be arranged so that they are both:

    1. (a)

      to the greatest benefit of the least advantaged, consistent with the just savings principle, and

    2. (b)

      attached to offices and positions open to all under conditions of fair equality of opportunity.

The principles of justice are lexically ordered, so that the first principle takes precedence. Thus, for example, if a distribution of resources in accordance with part (a) of the second principle, i.e., the difference principle, would somehow lead to citizens no longer being able to see themselves as free and equal, then this distribution is precluded by the first principle.

Rawls derives these principles through the thought experiment of the original position (Rawls, 1999, pp. 102–168). Here, people have convened to deliberate how to organize the basic structure of society. The distinguishing feature of the original position is that the parties deliberate behind a veil of ignorance (Rawls, 1999, pp. 118–123), designed to ensure impartiality. Thus, the parties do not know their personal characteristics, their social status, their generation, nor any probabilities for belonging to particular groups, and thus they cannot tailor the organization of society to their own narrow self-interest, only to a broader common interest. The two principles, Rawls argues, are what would emerge from such a hypothetical deliberation. As mentioned above, Rawlsian thought—especially fair equality of opportunity, but also the difference principle—is commonly applied to algorithmic fairness:

For example, the individual notions of algorithmic fairness proposed in work by Dwork et al. (2012) and Joseph et al. (2016)Footnote 2 have been described by Lepri et al. (2018) “as a mathematical formalization of the Rawlsian principle of ‘fair equality of opportunity’” from part (b) of the second principle.

Furthermore, Lee et al. (2021, Table 4) identify the same concept—fair equality of opportunity—as the philosophical origin of no less than six different statistical notions of algorithmic fairness, i.e., notions which typically require parity between algorithmic performance measures for different groups. For binary classification problems, these are: (1) False Negative Rate (FNR) parity (Hardt et al., 2016), (2) False Positive Rate (FPR) parity (Chouldechova, 2017), (3) equal odds, i.e., simultaneous True Positive Rate (TPR) and True Negative Rate (TNR) parities (Hardt et al., 2016), and (4) Positive Predictive Value (PPV) parity (Chouldechova, 2017). To these can be added (5) positive and (6) negative class balance (Kleinberg et al., 2017), two notions applicable when a binary classification is accomplished through some intermediate scoring mechanism. Then, positive (negative) class balance requires that the average score assigned to members of one group who belong to the positive (negative) class should be the same as the average score assigned to members of another group who belong to the positive (negative) class. For a more thorough discussion of Rawlsian equality of opportunity in machine learning, see Heidari et al. (2019).
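As a toy illustration of such parity notions (our own sketch, with made-up data and a hypothetical `rates` helper; it is not drawn from any of the cited papers), FNR parity and FPR parity reduce to comparing confusion-matrix rates across groups:

```python
def rates(y_true, y_pred):
    """False negative rate and false positive rate for binary labels."""
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    pos = sum(y_true)
    neg = len(y_true) - pos
    return fn / pos, fp / neg

# Made-up predictions for two groups (purely illustrative).
group_a = ([1, 1, 0, 0], [1, 0, 0, 1])  # (truth, prediction)
group_b = ([1, 1, 0, 0], [1, 1, 0, 1])

fnr_a, fpr_a = rates(*group_a)
fnr_b, fpr_b = rates(*group_b)
print(fpr_a == fpr_b)  # FPR parity holds between the groups
print(fnr_a == fnr_b)  # FNR parity fails
```

In this toy case FPR parity holds while FNR parity fails, which illustrates that the different parity notions can come apart even though they share a common philosophical motivation.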

A few more concrete examples show how Rawlsian principles are applied in practice. Leben (2017) develops an algorithm for trolley-style situations based on the Rawlsian original position and the maximin rule. The idea is to have a (self-driving) vehicle first estimate survival probabilities for different courses of action, then find the action that satisfies the maximin rule (for a critical discussion of this proposal, see Keeling, 2017).

Peng (2020) develops a process to de-bias machine learning algorithms so that they favor those who have been historically disadvantaged. The argument is that much of what is distributed by algorithms are not positions where part b of the second principle, fair equality of opportunity, applies, but rather primary goods, where part a, the difference principle, applies.Footnote 3 Thus, under this proposal, algorithms would be constructed so that they compensate for historical disadvantages by applying the difference principle whenever their outcomes have distributional effects.

Shah et al. (2021) propose and implement a Rawls classifier which can be applied to any black-box deep learning model to minimize the error rate on the worst-off sensitive sub-population. Technically, this allows existing models to be modified in the spirit of the difference principle.

Zhu et al. (2021) develop a way to prevent unfairness in recommendation systems, where new items, lacking history, may be recommended less often than they deserve.

Heidari et al. (2018) propose a technical mechanism to include fairness criteria inspired by Rawls as constraints in the optimization problems solved in machine learning training, thus guaranteeing certain corresponding properties of the resulting trained models.

What these concrete proposals have in common is that they involve technical mechanisms to include fairness criteria as constraints on how programmers should construct algorithms, or as constraints on optimization problems solved in machine learning training, thus guaranteeing or promoting certain corresponding properties of the resulting systems. In the following section, we critically discuss a hidden premise of such proposals which aim to uphold the difference principle.

3 The Difference Principle and Aggregation of Situations

The difference principle mandates that social and economic inequalities are to be arranged so that they are to the greatest benefit of the least advantaged. As we saw in the previous section, within the field of Rawlsian algorithmic fairness, this is typically interpreted as imposing a constraint on the workings of individual decision support systems: they should respect something like the risk-averse maximin rule (Leben, 2017; Peng, 2020; Shah et al., 2021; Zhu et al., 2021).

As a concrete example, consider Heidari et al. (2018), who develop fairness constraints on machine learning in a paper titled “Fairness behind a veil of ignorance: A welfare analysis for automated decision making”:

John Rawls proposes the concept of veil of ignorance as the ideal condition/mental state under which a policy maker can select the fairest among a number of political alternatives. He suggests that the policy maker performs the following thought experiment: imagine him/herself as an individual who knows nothing about the particular position they will be born in within the society, and is tasked with selecting the most just among a set of alternatives. In this hypothetical original/ex-ante position, if the individual is rational, they would aim to minimize risk and insure against unlucky events in which they turn out to assume the position of a low-benefit individual. [...] Our main conceptual contribution is to characterize fairness in the context of algorithmic decision making through the Rawlsian theory of justice: our proposal is for the ML expert wishing to train a fair decision making model (e.g. to decide whether salary predictions are to be made using a neural network or a decision tree) to perform the aforementioned thought experiment [...] To formalize the above, our core idea consists of comparing the expected utility a randomly chosen, risk-averse subject of algorithmic decision making receives under different predictive models. [...] Furthermore and from a computational perspective, our welfare-based measures of fairness are more convenient to work with due to their convex formulation. This allows us to integrate them as a constraint into any convex loss minimization pipeline, and solve the resulting problem efficiently and exactly. (Heidari et al., 2018, emphasis in original. The passage quoted is from the version published by ACM—the version at proceedings.neurips.cc is slightly differently worded.)

In summary, Heidari et al. (2018) thus argue the following: to achieve Rawlsian algorithmic fairness, the difference principle should be upheld in individual decision support systems. To facilitate this, their technical contribution makes it easy for ML experts to integrate convex fairness constraints into whatever systems they develop (such as for salary prediction), so that these systems will uphold the difference principleFootnote 4 with respect to their scope of operation. As more automated decision-making is used in society, if each of these systems upholds the difference principle in its particular situations, the difference principle will, presumably, be upheld at the aggregate level. However, thus articulated, we see that this position depends on a hidden premise concerning the aggregation properties of the difference principle. We now proceed to investigate this premise in greater detail. (For clarity, it should be noted that we do not claim that all theories of algorithmic fairness depend on such aggregation properties, nor that Rawls’s original application of the difference principle does—quite the contrary, as discussed in Section 3.3—only that some applications of the difference principle in the algorithmic fairness literature depend on such aggregation properties.)

3.1 Strong Aggregation

In a strong version, the hidden premise can be articulated as follows:

  • Strong aggregation: The difference principle is upheld at the aggregated level if the difference principle is upheld in the constituent situations.

We first observe that there are indeed circumstances where such aggregation properties hold. For example, egalitarian distributions work like that—if everyone first gets an equal share of something, and then gets another equal share of something else, their aggregated shares will also be equal. Similarly, entitlement theories of distribution, such as the one Nozick (1974, pp. 150–153) famously developed as a contrast to Rawls, also exhibit such an aggregation property—if someone first has acquired something justly and then someone else acquires something else justly, then the resulting aggregate is also just. Importantly, it also seems that Rawls’s first principle—the liberty principle—exhibits such aggregation. If the basic liberties of each person are respected in each situation, then they are also respected in the aggregate of these situations. Since “basic liberties can be restricted only for the sake of liberty” (Rawls, 1999, p. 266), it is, for example, not the case that some failure in another respect—say, social and economic inequalities—observed only in the aggregate of many liberty-respecting situations could constitute a legitimate reason to restrict liberty in any of the constituent situations. The first principle has lexical precedence over the second one.

Observe that the aggregation property we are investigating is about sufficiency: if some principle is upheld in the constituent situations, it is also upheld in the aggregate. It is not about necessity; the conditional is an ‘if’, not an ‘only if’. The examples above differ in this respect. In the egalitarian case, egalitarianism in the constituent situations is a sufficient but not a necessary condition for the aggregate to be egalitarian, since inequalities may even out so that aggregate equality is achieved without equality in each constituent situation. In the entitlement case, and in the case of Rawls’s first principle, however, justice in each constituent situation is both sufficient and necessary for the aggregate to be just.

However, the difference principle does not have this aggregation property. Strong aggregation is false, as can be proven by a simple counterexample:

Example 1

Let the goods allocated to individual i be denoted by the ith component of a distribution vector (so that (4,3) means that individual 1 gets 4 and individual 2 gets 3), let \(x \prec _{\textrm{DP}} y\) denote that y is preferred to x by the difference principle, and let aggregation of situations be component-wise addition (so that if an individual gets 1 in one situation and 2 in another situation, this individual gets \(1+2=3\) in the aggregation of these situations). Then we have the following counterexample to Strong aggregation:

$$\begin{aligned} {\begin{matrix} (1,2) & \prec _{\textrm{DP}} (1,1) \\ + (3,1) & \prec _{\textrm{DP}} (1,1) \\ \hline (4,3) & \succ _{\textrm{DP}} (2,2) \end{matrix}} \end{aligned}$$

In the first situation—the first row—we compare two possible distributions (which can be seen as the foreseeable results of two different systems being designed). To the left, there is a distribution where individual 1 gets 1, and individual 2 gets 2. To the right, there is a distribution where both individuals get 1. Now, the additional goods to individual 2 in the left-hand distribution do not benefit individual 1, who is the least advantaged under the left-hand distribution. More precisely, the greater inequality under the left-hand distribution does not make the least advantaged better off than under the egalitarian right-hand distribution—individual 1 still gets only 1. Thus, the right-hand distribution is preferred by the difference principle.

Proceeding to the second situation—the second row—we again compare two possible distributions between the two individuals (or groups). Again, the left-hand distribution is an unequal one, this time benefiting individual 1. But as before, the additional goods to individual 1 do not benefit individual 2, who is the least advantaged under the left-hand distribution, so the right-hand distribution is preferred here as well.

However, it is also possible to take a step back and consider the result of these preferences from the aggregated perspective. What is the overall distribution resulting from the situations? Summing the goods obtained by individual 1 in the distributions preferred by the difference principle in the first and second situations, we get 2, and the goods obtained by individual 2 also sum to 2. But summing the goods attained by the two individuals in the distributions not preferred by the difference principle, it turns out that both of them get more; individual 1 gets 4 and individual 2 gets 3. Moreover, considered as such an aggregate, this unequal left-hand distribution benefits individual 2, who is the least advantaged under this distribution. For individual 2 gets 3 under the left-hand distribution compared to only 2 under the right-hand distribution. Thus, the aggregated left-hand distribution is preferred by the difference principle, even though if considered one situation at a time, the right-hand distributions are preferred.
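The bookkeeping of Example 1 can also be checked mechanically. The sketch below is ours; the `dp_prefers` helper is a deliberately simplified encoding of the difference principle comparison—prefer the distribution whose least advantaged member gets more, and break ties in favor of less inequality, since extra inequality that does not benefit the worst-off is not justified.

```python
def dp_prefers(a, b):
    """Simplified difference-principle comparison of two distributions:
    prefer the one whose least advantaged member gets more; if the
    worst-off are equally well off, prefer the smaller spread."""
    if min(a) != min(b):
        return a if min(a) > min(b) else b
    return a if max(a) - min(a) <= max(b) - min(b) else b

left_1, right_1 = (1, 2), (1, 1)   # first situation (first row)
left_2, right_2 = (3, 1), (1, 1)   # second situation (second row)

# Situation by situation, the egalitarian right-hand distributions win.
assert dp_prefers(left_1, right_1) == right_1
assert dp_prefers(left_2, right_2) == right_2

# Aggregation is component-wise addition.
agg_left = tuple(x + y for x, y in zip(left_1, left_2))     # (4, 3)
agg_right = tuple(x + y for x, y in zip(right_1, right_2))  # (2, 2)

# In the aggregate, the difference principle prefers the left-hand side:
# the worst-off individual gets 3 rather than 2.
assert dp_prefers(agg_left, agg_right) == agg_left
```

The assertions go through, reproducing the counterexample: the right-hand distributions win in each constituent situation, yet the left-hand distributions win in the aggregate.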

3.2 Moderate Aggregation

However, even though Strong aggregation is false, proposals for Rawlsian algorithmic fairness that aim to uphold the difference principle in each situation do not need Strong aggregation to be plausible. A weaker version may be sufficient:

  • Moderate aggregation: The difference principle is usually upheld at the aggregated level if the difference principle is upheld in the constituent situations.

Moderate aggregation is an empirical claim, and we cannot conclusively prove or disprove it without empirical investigation. However, it is possible to reason a bit about its plausibility. Consider another example, a probabilistic one:

Example 2

Let the goods allocated to an individual in a situation i be determined by a lottery \(L_i(x_1, p_1; \ldots ; x_n, p_n)\), such that \(\sum _{j=1}^n p_j=1\), where the individual receives each \(x_j\) with probability \(p_j\). When applied to a population of many individuals, such a lottery generates a distribution of goods over the population. Two such distributions (which can again be seen as the foreseeable results of two different systems being designed) can be compared under the difference principle.Footnote 5 In such a situation, comparing population level distributions generated by two different lotteries, we have for example:

$$\begin{aligned} {\begin{matrix} L_1 = L(1,\frac{1}{5};2,\frac{3}{5};5,\frac{1}{5})&\prec _{\textrm{DP}} L_2 = L(1,\frac{1}{5};2,\frac{3}{5};3,\frac{1}{5}) \end{matrix}} \end{aligned}$$

To make the numbers concrete, in a population of 1 000, on average one fifth—200 people—would receive 1, three fifths—600 people—would receive 2, and one fifth—200 people—would receive 5 under the left-hand distribution. By contrast, under the right-hand distribution, on average one fifth—200 people—would receive 1, three fifths—600 people—would receive 2, and one fifth—200 people—would receive 3. Now, the least advantaged fifth (receiving 1) are not better off in the left-hand lottery where the most advantaged fifth receives more (5) than in the right-hand lottery where the most advantaged fifth receives less (3), so the greater inequality to the left cannot be motivated by benefiting the least advantaged. Thus, the right-hand lottery is preferred by the difference principle. The fact that the arithmetic mean of the left-hand lottery (2.4) is greater than that of the right-hand lottery (2.0) does not matter from the point of view of the least advantaged. Now, instead consider an aggregate of ten independent such lotteries. It is plausible to hold the following:

$$\begin{aligned} {\begin{matrix} \sum _{i=1}^{10} L_{1,i}&\succ _{\textrm{DP}} \sum _{i=1}^{10} L_{2,i} \end{matrix}} \end{aligned}$$

Just as in Example 1, the aggregated left-hand distribution is preferred by the difference principle, even though if considered one situation at a time, the right-hand distributions are preferred. For the intuition behind this claim, consider the following simulated distributions over outcomes in Fig. 1. Not only is the average outcome for the \(L_1\)s (close to \(10 \cdot 2.4\)) better than for the \(L_2\)s (close to \(10 \cdot 2.0\)), but—and this is what matters to the difference principle—the least advantaged are better off. The number of people in a population of 1 000 ending up with the lowermost outcomes (14, 15, 16, 17) is smaller for the aggregation of \(L_1\)s than for the aggregation of \(L_2\)s. The difference may not be great, but the aggregation of \(L_1\)s seems better for the least advantaged.

Fig. 1

Histograms of 1 000 simulated outcomes of the lottery aggregation \(\sum _{i=1}^{10} L_{1,i}\) (top) and 1 000 simulated outcomes of the lottery aggregation \(\sum _{i=1}^{10} L_{2,i}\) (bottom)
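A simulation in the spirit of Fig. 1 can be sketched as follows (our code; the seed, the population size, and the cutoff defining the lowermost outcomes are arbitrary choices):

```python
import random

random.seed(0)  # arbitrary seed, for reproducibility only

def draw(outcomes, probs):
    """One draw from a discrete lottery."""
    return random.choices(outcomes, weights=probs, k=1)[0]

def total(outcomes, probs, n=10):
    """One person's total over n independent situations."""
    return sum(draw(outcomes, probs) for _ in range(n))

L1 = ([1, 2, 5], [1/5, 3/5, 1/5])
L2 = ([1, 2, 3], [1/5, 3/5, 1/5])

pop1 = [total(*L1) for _ in range(1000)]  # aggregated L1s, 1 000 people
pop2 = [total(*L2) for _ in range(1000)]  # aggregated L2s, 1 000 people

# How many people end up with the lowermost outcomes (17 or less)?
low1 = sum(1 for x in pop1 if x <= 17)
low2 = sum(1 for x in pop2 if x <= 17)
print(f"L1 aggregate: mean {sum(pop1) / 1000:.2f}, outcomes <= 17: {low1}")
print(f"L2 aggregate: mean {sum(pop2) / 1000:.2f}, outcomes <= 17: {low2}")
```

On a typical run, the mean of the aggregated \(L_1\)s lands near 24 and that of the aggregated \(L_2\)s near 20, with fewer people in the lowermost outcomes under the \(L_1\) aggregation—the pattern visible in Fig. 1.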

Though the numbers chosen in Example 2 are arbitrary, and the exact outcomes of the simulations depicted are subject to chance (another run would yield somewhat different distributions), the overall tendency is, of course, not a coincidence. It is well known that for sums of independent stochastic variables \(X_i\) with the same expected value \(\mu \) and standard deviation \(\sigma \), the expected value E and standard deviation D are as follows:

$$\begin{aligned} E \left( \sum _{i=1}^n X_i \right) = n \mu & D \left( \sum _{i=1}^n X_i \right) = \sigma \sqrt{n} \end{aligned}$$
(1)

The important insight here is that whereas the expected value grows with the number of variables summed, the standard deviation grows only with the square root of the number of variables summed, i.e., slower. Since the standard deviation is a measure of statistical dispersion, this means that as more terms are summed, the distribution becomes relatively more concentrated around the expected value for each additional term.

Using, as is common, the standard deviation as a measure of statistical dispersion, the two aggregated lotteries can be roughly characterized by their expected values and standard deviations. A ‘typical’ outcome of an aggregated lottery is its expected value plus/minus a few standard deviations. Such a characterization yields an explanation of why \(L_2\) is preferred to \(L_1\) as a single instance, but the aggregate of \(L_2\)s is not preferred to the aggregate of \(L_1\)s:

$$\begin{aligned} {\begin{matrix} E(L_1) \pm D(L_1) \approx 2.4 \pm 1.36 & \prec _{\textrm{DP}} E(L_2) \pm D(L_2) \approx 2.0 \pm 0.63\\ E \left( \sum _{i=1}^{10} L_{1,i} \right) \pm D \left( \sum _{i=1}^{10} L_{1,i} \right) \approx 24 \pm 4.29 & \succ _{\textrm{DP}} E \left( \sum _{i=1}^{10} L_{2,i} \right) \pm D \left( \sum _{i=1}^{10} L_{2,i} \right) = 20 \pm 2 \end{matrix}} \end{aligned}$$

In particular, if we consider a typical bad outcome to be the expected value minus one standard deviation, then in the single instance \(L_2\) fares better than \(L_1\) (\(2-0.63=1.37 > 2.4-1.36 = 1.04\)), but in the aggregate, the \(L_2\)s fare worse than the \(L_1\)s (\(20 - 2 = 18 < 24 - 4.29 = 19.71 \)).
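These figures follow directly from Eq. (1); the short sketch below (ours, with a hypothetical `moments` helper) reproduces them:

```python
from math import sqrt

def moments(outcomes, probs):
    """Expected value and standard deviation of a discrete lottery."""
    mu = sum(x * p for x, p in zip(outcomes, probs))
    var = sum((x - mu) ** 2 * p for x, p in zip(outcomes, probs))
    return mu, sqrt(var)

mu1, sd1 = moments([1, 2, 5], [1/5, 3/5, 1/5])  # roughly 2.4 and 1.36
mu2, sd2 = moments([1, 2, 3], [1/5, 3/5, 1/5])  # roughly 2.0 and 0.63

n = 10  # number of independent situations aggregated
# By Eq. (1), means scale with n but standard deviations only with sqrt(n).
agg1 = (n * mu1, sqrt(n) * sd1)  # roughly 24 and 4.29
agg2 = (n * mu2, sqrt(n) * sd2)  # 20 and 2.0

# 'Typical bad outcome' = expected value minus one standard deviation.
assert mu2 - sd2 > mu1 - sd1                  # single instance: L2 fares better
assert agg1[0] - agg1[1] > agg2[0] - agg2[1]  # aggregate: L1 fares better
```

The two assertions capture the reversal: the ordering by typical bad outcome flips between the single instance and the ten-lottery aggregate.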

Using expected value and standard deviation (i.e., the first two moments of the probability distribution) is a common and often useful simplification, even though it cannot formally reflect all possible utility functions (see, e.g., Varian, 1992, p. 371). To understand why these two moments go a long way towards capturing risk-aversion of the kind embodied in the Rawlsian maximin rule, consider the Chebyshev inequality:

$$\begin{aligned} P \left( |X-\mu | \ge k \sigma \right) \le \frac{1}{k^2} \end{aligned}$$
(2)

In words, the Chebyshev inequality offers an upper bound on how much probability mass can be outside an interval of 2k standard deviations, centered at the expected value \(\mu \). An interval of \(2\sqrt{2}\) (\(k=\sqrt{2}\)) standard deviations includes at least half of all outcomes; an interval of four (\(k=2\)) standard deviations includes at least three quarters of all outcomes, etc. Furthermore, for most distributions, this is a gross underestimation of how much probability mass actually falls within the interval—equality is attained only by special discrete distributions concentrating all their mass at \(\mu \) and \(\mu \pm k\sigma \).
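The bound in Eq. (2) can be checked directly for a discrete lottery. The sketch below (ours) compares the actual probability mass outside \(\mu \pm k\sigma \) with the Chebyshev bound, using the lottery \(L_1\) from Example 2:

```python
from math import sqrt

outcomes, probs = [1, 2, 5], [1/5, 3/5, 1/5]  # lottery L1 from Example 2
mu = sum(x * p for x, p in zip(outcomes, probs))
sigma = sqrt(sum((x - mu) ** 2 * p for x, p in zip(outcomes, probs)))

for k in (sqrt(2), 2, 3):
    # Actual probability mass at or beyond k standard deviations from the mean.
    outside = sum(p for x, p in zip(outcomes, probs) if abs(x - mu) >= k * sigma)
    bound = 1 / k ** 2  # Chebyshev's upper bound, Eq. (2)
    assert outside <= bound
    print(f"k = {k:.2f}: actual tail mass {outside:.2f} <= bound {bound:.2f}")
```

For this three-point lottery the bound is indeed a gross overestimate of the actual tail mass, illustrating the remark above.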

At this stage, however, it is reasonable to ask whether it is appropriate to use the standard deviation as our measure of dispersion, thus implicitly defining the least advantaged as those who fall at one, two, or some other multiple k of standard deviations from the arithmetic mean. In particular, why not use the minimum instead, as the term ‘maximin’ seems to suggest? For example, revisiting Example 2, this worst possible case is 1 in a single instance of \(L_1\) and \(L_2\) alike, and it is \(10 \cdot 1 = 10\) in ten aggregated instances of \(L_1\) and \(L_2\) alike.

But Rawls’s least advantaged is not “a low-benefit individual” (in the phrase of Heidari et al., 2018) but rather a group, viz. those who are the least fortunate with respect to (i) family and class, (ii) natural endowments, and (iii) fortune and luck, but “all within the normal range” (Rawls, 1999, p. 83), i.e., removing the most extreme cases from consideration. Why is this so? An important part of the answer is that Rawls (1999, p. 84) does not want to “distract our moral perception by leading us to think of persons distant from us whose fate arouses pity and anxiety”. Exactly how large this Rawlsian ‘normal range’ should be is of course debatable, but to construe it as some multiple k standard deviations from the arithmetic mean seems natural and in line with Rawls’s theory.

The probability of anyone receiving the worst possible outcome (1) in a single instance of \(L_1\) or \(L_2\) is \(\frac{1}{5}\), i.e., a considerable chance. Thus, it seems reasonable to assess this outcome as being within the Rawlsian ‘normal range’. Since this outcome is the same in the two lotteries, including it in the assessment underpins the judgment that \(L_2\) is preferable under the difference principle—the greater inequality of \(L_1\) is not to the benefit of the least advantaged.

However, the probability of anyone receiving the worst possible outcome (10) when aggregating 10 independent instances of \(L_1\) or \(L_2\) is \(\left( \frac{1}{5} \right) ^{10} \approx 10^{-7}\), i.e., one in ten million.Footnote 6 Thus, it seems reasonable to assess this outcome as not being within the Rawlsian ‘normal range’ and its persistence in both lotteries is not an argument against the difference principle preferring the aggregate of ten \(L_1\)s to the aggregate of ten \(L_2\)s.

Note that in Example 2, the lotteries are independent, so that the expressions for expected value and standard deviation given in Eq. (1) hold. With many independent lotteries, each individual outcome becomes less important, because outcomes tend to even out in the long run. However, it is instructive to also consider a situation with correlations. Recall that when adding stochastic variables in the general case, not only variances but also covariances must be added:

$$\begin{aligned} V(X+Y) = V(X) + V(Y) + 2C(X,Y) \end{aligned}$$
(3)

The standard deviation, as usual, is found by taking the square root of the variance.

Example 3

Let individual instances of \(L_1\) and \(L_2\) be as before, but drop the assumption of independence between situations, and instead let lotteries \(L_{1,i}\) and \(L_{1,j}\) have non-zero covariance \(C_1\) and lotteries \(L_{2,i}\) and \(L_{2,j}\) have non-zero covariance \(C_2\). Then, as before, \(L_1 \prec _{\textrm{DP}} L_2\) as individual instances, but what happens in the aggregate depends on \(C_1\) and \(C_2\). For simplicity, consider aggregating just two lotteries. If, as before, we use the expected value minus one standard deviation as our guide, then we must compare \(2.4 + 2.4 - \sqrt{1.84 + 1.84 + 2C_1}\) with \(2 + 2 - \sqrt{0.4 + 0.4 + 2C_2}\) and find the greater of the two. The outcome hinges on the particular values of \(C_1\) and \(C_2\). Note that a negative covariance makes the variance of the aggregate smaller (similar to Example 1), whereas a positive covariance makes it greater.
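To make the dependence on covariance concrete, here is a small numeric sketch (ours; `typical_bad` is a hypothetical helper implementing the mean-minus-one-standard-deviation criterion from the text) for the two-lottery aggregate:

```python
from math import sqrt

mu1, var1 = 2.4, 1.84  # one instance of L1 (Example 2)
mu2, var2 = 2.0, 0.4   # one instance of L2

def typical_bad(mu, var, cov):
    """Mean minus one standard deviation for the sum of two identically
    distributed instances with covariance cov, as per Eq. (3)."""
    return 2 * mu - sqrt(2 * var + 2 * cov)

# Independent instances (cov = 0): with only two lotteries aggregated,
# the L2 pair still fares better on this criterion.
print(typical_bad(mu1, var1, 0.0))    # about 2.88
print(typical_bad(mu2, var2, 0.0))    # about 3.11

# Perfectly anti-correlated L1s (cov = -1.84): the dispersion of the
# aggregate vanishes, and the L1 pair becomes clearly preferable.
print(typical_bad(mu1, var1, -1.84))  # 4.8
```

With independent situations, the crossover in favor of the \(L_1\)s appears only for somewhat larger aggregates (cf. the ten-lottery case in Example 2); with sufficiently negative covariance it can appear already for two.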

Example 3 illustrates the importance of covariance. Though its exact impact clearly depends on how the problem is formalized and mathematically modeled, the fact remains that combining situations with negative covariance decreases statistical dispersion (to zero, in the limit of perfect anti-correlation).

Recall that we are assessing the claim made in Moderate aggregation: that upholding the difference principle in constituent situations is usually sufficient to uphold it in the aggregate. Though this is an empirical claim, the evidence from the examples given weighs against it. More precisely: in cases with many independent (or at least uncorrelated) situations, applying the difference principle in each situation and selecting a lottery with lower expected value because of its smaller statistical dispersion may not uphold the difference principle in the aggregate, because the aggregate expected value grows faster than the aggregate standard deviation. In cases with correlated situations, applying the difference principle in each situation in this way forfeits not only the greater expected value, but also the possibility of using a greater statistical dispersion in one situation as a counterbalance to a greater statistical dispersion in another. Thus, we reject Moderate aggregation.

3.3 Additional Perspectives on Aggregation

The insight that the difference principle lacks an aggregation property is not new. Rawls himself emphasizes that the difference principle should be applied not on a case-by-case basis, but to the basic structure of society:

The situation where someone is considering how to allocate certain commodities to needy persons who are known to him is not within the scope of the principles. They are meant to regulate basic institutional arrangements. We must not assume that there is much similarity from the standpoint of justice between an administrative allotment of goods to specific persons and the appropriate design of society. Our common sense intuitions for the former may be a poor guide to the latter. (Rawls, 1999, p. 56)

Rawls clearly envisions upholding the difference principle in other ways than by upholding it in particular situations and aggregating from there.

That the difference principle lacks aggregation properties is also at least implicit in the observation made by Nozick (1974, pp. 160–164) about the difficulty of upholding what he calls patterned principles of distributive justice. Patterns such as a distribution to the benefit of the least advantaged always risk being upset by voluntary actions undertaken in subsequent situations, and may thus require constant redistribution. The normative implications of this are out of scope here—as mentioned in the introduction, we are investigating algorithmic fairness on the assumption that Rawls is broadly right—but we still note Nozick’s descriptive observation: patterned principles such as the difference principle are not in general self-sustaining in the sense that they are upheld in the aggregate if upheld in every individual instance.

An observation explicitly about the difference principle’s lack of aggregation properties is made by Schmidtz:

Second, although the principle may apply to many “abstract” possibilities, it does not apply to case-by-case redistribution. The difference principle applies only to a choice of society’s basic structure. Is this restriction of scope ad hoc, as Rawls’s critics often say? No! Why not? Because applying the difference principle to every decision, as if Joe should never earn or spend a dollar unless he can prove that doing so is to the greatest benefit of the least advantaged, would cripple the economy, hurting everyone, including the least advantaged. The difference principle rules out institutions that work to the detriment of the least advantaged, including ones that overzealously apply the difference principle to the detriment of the least advantaged. There is nothing ad hoc about this constraint. It derives straightforwardly from the difference principle itself. (Schmidtz, 2006, pp. 189–190, emphasis in original).

Since the examples have led us to consider aggregates of stochastic outcomes, it is instructive to relate our analysis to financial economics, where structurally similar problems have long been studied both theoretically and empirically. Though it is out of scope to review the different theories thoroughly, it is highly relevant to consider the following short description from a classic textbook:

This analysis involves considerations of general equilibrium since the value of a risky asset inherently depends on the presence or absence of other risky assets which serve as complements or substitutes with the asset in question. Therefore, in most models of asset pricing, the value of an asset ends up depending on how it covaries with other assets. What is surprising is how generally this insight emerges in models that are seemingly very different. (Varian, 1992, p. 370, emphasis in original)

To spell out what is being said, note that general equilibrium means precisely that each situation/asset cannot be assessed on its own, but that the entire aggregate (investment portfolio) must be considered together. The aggregation property does not hold. Thus, it is tempting to paraphrase Varian and say that determining how the difference principle, in the aggregate, is best served in each particular risky situation requires a holistic approach, since the role of each risky situation depends on the presence or absence of other risky situations.

Though we have rejected the aggregation property with respect to the difference principle, it does not follow that there is no such aggregation property in broader Rawlsian algorithmic fairness, which includes both principles of justice. Recall that the principles are lexically ordered and that the first principle—the liberty principle—takes precedence. As pointed out above, this principle exhibits a strong aggregation property: if the equal basic liberties of each person are not violated in any one situation, then the aggregate of these situations also does not violate these liberties. Indeed, just like in the entitlement theory mentioned above, upholding Rawls’s first principle in the constituent situations is both sufficient and necessary for upholding the first principle in the aggregate.

It is also important to bear in mind that the lack of an aggregation property matters only when situations can indeed be aggregated, so that a bad outcome in one situation can be (over-)compensated for by a good outcome in another. For example, not being interviewed for one job can be compensated for by being interviewed for another one, not being granted a loan by a bank can be compensated for by being granted one from a competitor, and getting a bad suggestion for a book to read can be compensated for by getting a good suggestion for another one. But if situations cannot be aggregated, perhaps because they only occur once or because they represent truly incommensurable goods (for an introduction to incommensurable values, see Hsieh & Andersson 2021; for a classic defense of ‘spheres of justice’ between which redistribution does not make sense, see Walzer 1983), then the lack of an aggregation property does not matter. If the difference principle is to be applied in such situations, it has to be applied to them separately, because there is no (reasonable) aggregate.

4 Upholding the Difference Principle in other Ways

In the previous section, we investigated and rejected the aggregation property of the difference principle: it is not the case that if the difference principle is upheld in each particular situation, it is also upheld at the aggregate level. Upholding it in each particular situation is not sufficient to also uphold it at the aggregate level. But we could also ask whether it is necessary. Given the evidence, it seems that a better way to uphold the difference principle would be to redistribute goods ex post, when the picture of the aggregate emerges, rather than second-guessing it ex ante in all the constituent situations.

Hedden (2021), in his critique of statistical notions of algorithmic fairness, makes a similar remark. Though it does not explicitly pertain to Rawls or the difference principle, the general point seems equally valid in our context:

The conceptual point is this: When a predictive algorithm is used to make decisions with distributional consequences or other effects that we deem unfair or unjust, this does not mean that the algorithm itself is unfair or biased against individuals in virtue of their group membership. The unfairness or bias could instead lie elsewhere: with the background conditions of society, with the way decisions are made on the basis of its predictions, and/or with various side effects of the use of that algorithm, such as the exacerbation of harmful stereotypes. The practical point is that, as a result, the best response may sometimes be not to modify the predictive algorithm itself, but to instead intervene elsewhere, by changing the background conditions of society (e.g., through reparations, criminal justice reforms, or changes in the tax code), by modifying how we act on the basis of the algorithm’s predictions (e.g., by adopting different risk thresholds for different groups, above which we deny bail, or reject a loan application, and so on), or by attempting to mitigate the other negative side effects of the algorithm’s use. Hedden (2021)

‘Intervening elsewhere’ is indeed an important part of the Rawlsian toolbox—most often this is probably a much better way to uphold the difference principle than to modify each individual decision-making situation.

5 Conclusions

The Rawlsian approach to algorithmic fairness is a popular one. However, it also comes with complications. The argument presented in Section 3 suggests that the root cause of many of these difficulties is the hidden premise that the difference principle is upheld at the aggregated level if it is upheld in the constituent situations. The falsity of this premise, we propose, is a good explanation of the complications we identified in previous work (Franke, 2021).

First, attitudes to risk. Only in some situations—typically, irrevocable choices with very high stakes—are there good reasons to adopt the risk-averse maximin rule. In other—more mundane—situations, risk-neutrality or even risk-seeking may be appropriate. This observation suggests that proposals for Rawlsian algorithmic fairness that apply the difference principle in every situation are misguided (Franke, 2021, Section 2). The falsity of the aggregation premise neatly explains what goes wrong in such proposals. If Rawls is right, the difference principle applies to the basic structure of society. The fact that other attitudes to risk are appropriate in particular situations is not a problem, as long as this occurs within this basic structure. The difference principle can be upheld in the aggregate through other mechanisms, such as redistribution.

Second, the scope of stakeholders. Proposals for Rawlsian algorithmic fairness sometimes delimit very narrow sets of stakeholders to be considered (e.g., the people who are classified as false negatives, true positives, true negatives, and false positives, respectively), whereas the set of stakeholders in Rawls’s original position is in fact much broader—in a sense, everyone (Franke, 2021, Section 3). Again, the falsity of the aggregation premise explains where the tension comes from. For a particular constituent situation, the set of stakeholders may indeed be plausibly delimited. But since upholding the difference principle in these situations does not entail upholding it at the aggregated level, such delimited sets of stakeholders appear seriously inadequate from the aggregated point of view. The same explanation pertains to the related complication with defining the least advantaged (Franke, 2021, Section 4).

Third, knowledge about probabilities. Proposals for Rawlsian algorithmic fairness which aim to uphold the difference principle in particular situations face a dilemma with respect to knowledge about probabilities: either (i) disregard relevant statistical information of the kind often considered to be at the core of algorithmic fairness, or (ii) incorporate this information and abandon the Rawlsian veil of ignorance for a (thinner) veil of uncertainty (Franke, 2021, Section 5). As before, the falsity of the aggregation premise explains the origin of this dilemma—it occurs because the difference principle is being applied to particular situations, and the falsity of the aggregation premise suggests that this is misguided.

In this sense, the missing aggregation property sheds explanatory light on the promise and peril of Rawlsian algorithmic fairness.

Having made the observation that the difference principle is sometimes applied to what seems to be the wrong situations, we can also note that this is in line with other comments in the literature. Wong (2019) makes a similar observation, arguing that researchers have taken algorithmic bias seriously but primarily conceptualized it as a technical task, while it should rather, first and foremost, be conceptualized as a political question. If Wong’s observation is right, it is not surprising that there are many proposals in the literature where the difference principle is applied to particular situations, because this is most often where it is technically feasible to apply it. By contrast, applying it to the basic structure of society is beyond what is technically feasible, precisely because of the missing aggregation property.

The conclusion that Rawlsian algorithmic fairness does not require the difference principle to be upheld in each and every situation, but rather in the aggregate, has considerable practical relevance. Both vendors building AI solutions and organizations procuring them are currently making efforts to achieve algorithmic fairness inspired by Rawls. They would be well advised not to proceed on naïve assumptions about how the difference principle aggregates. Rawls’s first principle—the liberty principle—should be upheld in each and every situation, but the difference principle should not. If Rawls is broadly right and the difference principle should be upheld, it should be applied as intended, to the basic structure of society.