In the last three parts of the dissertation, we have dissected discrimination. First, we stated that two requirements have to be fulfilled for an act to be discriminatory: (1) In the decision situation, there has to be a differentiation between two or more things/people. (2) At least one of these things/people has to be treated in a systematically different way compared to the others.Footnote 1 Because this definition is very general, we then concentrated on the different treatment of people or groups, which we named social discrimination. Here, we distinguished two types of discrimination: taste-based discrimination and statistical discrimination. While the former is possible in any kind of decision-making, the latter can only occur in decision-making under uncertainty. Next, we examined the psychological mechanisms behind taste-based discrimination, whether such tastes actually exist, and how they could have evolved. Finally, we investigated how we acquire the beliefs on the basis of which we form subjective probabilities of possible scenarios.

This last chapter reassembles these dissected components of discrimination and analyses how this understanding of discrimination can contribute to the discourse presented in the introduction. To that end, we first condense the findings of this dissertation into a summarising model. Then, we look at the implications this model has for a normative theory of discrimination.

5.1 A Descriptive Model of Discrimination

In order to summarise the preceding deliberations in a model, we have to interconnect two perspectives: what type the decision-maker’s preferences are and how the decision-maker forms his beliefs. Concerning the type of preferences, we have to differentiate between agent-neutral and agent-relative preferences, because only the latter lead to taste-based discrimination. Since we have already thoroughly discussed agent-neutral and agent-relative preferences, we will not elaborate on them again here.

Regarding the formation of beliefs, we have to distinguish three circumstances: (1) The formation of our beliefs is irrelevant because we do not need beliefs to form subjective probabilities in the first place. This is the case in decision-making under certainty. It is important to note that these have to be correctly recognised situations of certainty. This excludes the possibility that the decision-maker is actually confronted with uncertainty yet assigns a subjective probability of 1 to one scenario, which from his perspective suggests certainty. We have to exclude such a situation because even though the decision-maker thinks that he is independent of any subjective probabilities, these very probabilities make him mistake uncertainty for certainty. Likewise, the reverse is also possible: a decision-maker thinks that a decision underlies uncertainty although it actually underlies certainty. In this case, he again makes use of subjective probabilities, which is why we exclude such a situation from this first distinction as well.

Now, it might be objected that, ultimately, the correct understanding that a given decision underlies certainty must again be based on the decision-maker’s beliefs. This is of course true. Yet, in such a situation, the respective beliefs which correctly indicate that a decision underlies certainty are not subjectively formed but objectively given. As a result, it is irrelevant how a decision-maker forms his beliefs because this process does not influence decision-making under certainty. Admittedly, in practice, it is questionable how often the idea of objectively given beliefs applies. It could even be argued that in the end all beliefs and thereby all probabilities are subjective (cf. Savage, 1954). If that were true, this first distinction could be ignored and we would directly start with the second one.

(2) The formation of our beliefs adheres to objective Bayesianism. This means two things. First, when confronted with new evidence, we update our beliefs employing Bayes’ law. Second, in the absence of any evidence for how probable different scenarios are, we use a uniform prior. As a consequence, there are no inherent prior beliefs. Or, strictly speaking, there are only two inherent prior beliefs: in the absence of any evidence we use a uniform prior, and we update our priors according to Bayes’ law. Finally, all other belief formation methods which, regarding beliefs that are directly or indirectly linked to social categories, lead to the exact same results as objective Bayesianism are also part of this distinction. Concerning discrimination, they are equivalent to objective Bayesianism, which is why we from now on class them with objective Bayesianism.
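To make this concrete, the two inherent prior beliefs of objective Bayesianism, a uniform prior and updating via Bayes’ law, can be sketched in a few lines of code. This is only a minimal illustration; the two scenarios and the likelihoods of the evidence are invented for the example.

```python
from fractions import Fraction

def bayes_update(prior, likelihoods):
    """Update a belief distribution with Bayes' law.
    prior: dict hypothesis -> probability
    likelihoods: dict hypothesis -> P(evidence | hypothesis)"""
    unnorm = {h: prior[h] * likelihoods[h] for h in prior}
    total = sum(unnorm.values())
    return {h: p / total for h, p in unnorm.items()}

# Objective Bayesianism: with no evidence, start from a uniform prior ...
hypotheses = ["scenario_a", "scenario_b"]
belief = {h: Fraction(1, len(hypotheses)) for h in hypotheses}

# ... and change beliefs only via Bayes' law as evidence arrives.
# Hypothetical likelihoods of one observed piece of evidence:
evidence = {"scenario_a": Fraction(3, 4), "scenario_b": Fraction(1, 4)}
belief = bayes_update(belief, evidence)
# posterior: scenario_a -> 3/4, scenario_b -> 1/4
```

Because the prior is uniform, the posterior is driven entirely by the evidence; two objective Bayesians exposed to the same evidence end up with the same beliefs.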

(3) The formation of our beliefs adheres to subjective Bayesianism or any non-Bayesian method. As we know, subjective Bayesianism allows any prior beliefs in a decision situation that lacks prior evidence, as long as they fulfil the three assumptions of probability theory (cf. Kolmogorov, 1933). So, this is where inherent prior beliefs can come into play. The same is true for non-Bayesian belief formation methods.Footnote 2 Additionally, these methods (partly) deviate from Bayes’ law in their belief updating process. Because of this, subjective Bayesianism and non-Bayesianism can lead to any possible belief despite substantial disconfirming evidence. As a result, under these conditions it seems pointless to describe a belief as rational or irrational, which is why we characterise such beliefs as biased. In turn, these biased beliefs then lead to biased statistical discrimination.Footnote 3
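How a non-Bayesian updating rule can preserve a belief despite substantial disconfirming evidence can likewise be sketched. The discounting rule below and all numbers are hypothetical; it merely stands in for the wide class of updating processes that deviate from Bayes’ law.

```python
def bayes_update(p, lik_h, lik_not_h):
    """Bayes' law: update P(h) given the likelihoods of the evidence."""
    return p * lik_h / (p * lik_h + (1 - p) * lik_not_h)

def biased_update(p, lik_h, lik_not_h, discount=0.9):
    """Non-Bayesian rule: evidence that speaks against the held belief
    is largely discounted before updating (a form of confirmation bias)."""
    if lik_h < lik_not_h:  # disconfirming evidence
        lik_h += discount * (lik_not_h - lik_h)
    return p * lik_h / (p * lik_h + (1 - p) * lik_not_h)

p_bayes = p_biased = 0.7  # same initial group specific belief
for _ in range(20):       # 20 pieces of disconfirming evidence
    p_bayes = bayes_update(p_bayes, 0.2, 0.8)
    p_biased = biased_update(p_biased, 0.2, 0.8)

# The Bayesian belief collapses towards 0; the biased belief persists.
print(round(p_bayes, 4), round(p_biased, 2))  # 0.0 0.33
```

After twenty pieces of disconfirming evidence, the Bayesian updater has all but abandoned the belief, while the biased updater still holds it with substantial probability; with a stronger discount, the belief would not move at all.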

Figure 5.1: Descriptive model of discrimination

Figure 5.1 presents the respective intersections of the two types of preferences and the three distinctions regarding the formation of beliefs. This leads to six cases, which we will discuss individually in the following pages. Note that the top left “field” reminds us that the model is always embedded in a certain learning environment. Therefore, the specific beliefs someone learns depend not only on his belief formation process but also on his learning environment.

No Discrimination Regarding Social Categories

There is only one situation where there is certainly no discrimination regarding social categories and therefore no social discrimination: when the decision-maker has agent-neutral preferences and the decision he has to take underlies certainty (and he knows that). In case of certainty, there is non-discrimination regarding social categories in a situation where providers offer the same characteristics \(i\) if:

$$\forall {x}_{i}^{{\mathcal{M}}_{a}},{x}_{i}^{{\mathcal{M}}_{b}}\in X:u\left({x}_{i}^{{\mathcal{M}}_{a}}\right)=u\left({x}_{i}^{{\mathcal{M}}_{b}}\right)$$

Although agent-neutral preferences do not allow discrimination regarding social categories if there is certainty, they still enable non-social discrimination.Footnote 4 This is actually true for all intersections in Figure 5.1; yet, we will only write it out here and in the next intersection, which involves taste-based discrimination. In case of decision-making under certainty, we get the following formulation if alternatives have two differing characteristics and a decision-maker prefers characteristics \(i\) to characteristics \(j\) while being indifferent between the groups the providers of these characteristics belong to:

$$\exists !{x}_{i},{x}_{j}\in X:u\left({x}_{i}\right)>u\left({x}_{j}\right)$$
$$\begin{gathered} \wedge \forall {x}_{i}^{{\mathcal{M}}_{a}},{x}_{i}^{{\mathcal{M}}_{b}},{x}_{j}^{{\mathcal{M}}_{a}},{x}_{j}^{{\mathcal{M}}_{b}}\in X:u\left({x}_{i}^{{\mathcal{M}}_{a}}\right)>u\left({x}_{j}^{{\mathcal{M}}_{a}}\right)\wedge u\left({x}_{i}^{{\mathcal{M}}_{b}}\right)>u\left({x}_{j}^{{\mathcal{M}}_{b}}\right) \\ \wedge u\left({x}_{i}^{{\mathcal{M}}_{a}}\right)>u\left({x}_{j}^{{\mathcal{M}}_{b}}\right)\wedge u\left({x}_{i}^{{\mathcal{M}}_{b}}\right)>u\left({x}_{j}^{{\mathcal{M}}_{a}}\right) \\ \wedge u\left({x}_{i}^{{\mathcal{M}}_{a}}\right)=u\left({x}_{i}^{{\mathcal{M}}_{b}}\right)\wedge u\left({x}_{j}^{{\mathcal{M}}_{a}}\right)=u\left({x}_{j}^{{\mathcal{M}}_{b}}\right) \end{gathered}$$

Taste-Based Discrimination

Given that the decision-maker deals with certainty and has agent-relative preferences, he will act in a taste-based discriminatory way. There is taste-based discrimination if the knowledge of who the providers of the alternatives’ characteristics are: (a) leads to a preference for one alternative over another even though they have the same characteristics; and/or (b) changes preferences compared to a situation where the providers are unknown.

In the previous chapters, we distinguished between two types of taste-based discrimination, namely a weak and a strong version. While in case of strong taste-based discrimination the decision-maker is willing to bear costs so as to be a taste-based discriminator, this does not apply in case of weak taste-based discrimination. We will only illustrate the difference between the two versions in this intersection and refrain from it in the other two intersections that involve taste-based discrimination.Footnote 5 Moreover, we will only formalise taste-based discrimination in regard to provider situations.Footnote 6 There is weak taste-based discrimination in a situation where alternatives have two differing characteristics and the decision-maker prefers characteristics \(i\) to characteristics \(j\) if:

$$\begin{gathered}\exists {x}_{i}^{{\mathcal{M}}_{a}},{x}_{i}^{{\mathcal{M}}_{b}},{x}_{j}^{{\mathcal{M}}_{a}}\in X:u\left({x}_{i}^{{\mathcal{M}}_{a}}\right)>u\left({x}_{j}^{{\mathcal{M}}_{a}}\right) \\ \wedge u\left({x}_{i}^{{\mathcal{M}}_{a}}\right)>u\left({x}_{i}^{{\mathcal{M}}_{b}}\right)\wedge u\left({x}_{j}^{{\mathcal{M}}_{a}}\right)<u\left({x}_{i}^{{\mathcal{M}}_{b}}\right)\end{gathered}$$

In contrast to that, there is strong taste-based discrimination in a situation where alternatives have two differing characteristics and the decision-maker prefers characteristics \(i\) to characteristics \(j\) if:

$$\begin{gathered}\exists {x}_{i}^{{\mathcal{M}}_{a}},{x}_{i}^{{\mathcal{M}}_{b}},{x}_{j}^{{\mathcal{M}}_{a}}\in X:u\left({x}_{i}^{{\mathcal{M}}_{a}}\right)>u\left({x}_{j}^{{\mathcal{M}}_{a}}\right) \\ \wedge u\left({x}_{i}^{{\mathcal{M}}_{a}}\right)>u\left({x}_{i}^{{\mathcal{M}}_{b}}\right)\wedge u\left({x}_{j}^{{\mathcal{M}}_{a}}\right)\ge u\left({x}_{i}^{{\mathcal{M}}_{b}}\right)\end{gathered}$$
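The distinction between the weak and the strong version can be sketched as a small classifier over the utility values in the two formalisations above. The concrete utility values are invented for illustration only.

```python
def classify_taste_based(u_i_a, u_i_b, u_j_a):
    """Classify taste-based discrimination for a decision-maker who
    prefers characteristics i to j and group M_a to M_b.
    u_i_a: utility of characteristics i provided by a member of M_a
    u_i_b: utility of characteristics i provided by a member of M_b
    u_j_a: utility of characteristics j provided by a member of M_a"""
    if not (u_i_a > u_j_a and u_i_a > u_i_b):
        return "no taste-based discrimination"
    if u_j_a >= u_i_b:
        # The dispreferred characteristics j from the preferred group
        # (weakly) beat the preferred characteristics i from the
        # dispreferred group: the decision-maker bears costs to
        # discriminate.
        return "strong"
    return "weak"

# Weak: the group taste shifts utilities, but not enough to reverse i > j.
print(classify_taste_based(u_i_a=10, u_i_b=8, u_j_a=6))   # weak
# Strong: even the inferior characteristics j win when provided by M_a.
print(classify_taste_based(u_i_a=10, u_i_b=5, u_j_a=6))   # strong
```

The only difference between the two cases lies in whether \(u(x_{j}^{\mathcal{M}_{a}})\) falls below or (weakly) above \(u(x_{i}^{\mathcal{M}_{b}})\), exactly mirroring the strict and weak inequality in the two formulas.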

Statistical Discrimination

The fact that the formation of our beliefs is relevant implies that the decision situation involves uncertainty. In this intersection, we make two assumptions. (1) The way we form and update beliefs adheres to objective Bayesianism or any equivalent method that fulfils the requirements stated at the beginning of this chapter. Therefore, we only have group unspecific inherent prior beliefs, and these beliefs exclusively comprise objective Bayesianism. This is indicated by \({\beta }_{{\theta }_{OB}^{\gamma }}\) and the absence of \({\beta }_{{\mu }^{\gamma }}\). (2) We have agent-neutral preferences. These two factors (might) lead to pure statistical discrimination, as the following formulations show for a situation where providers offer the “same” characteristics. We first exclude taste-based discrimination:

$$\begin{gathered}\forall {f}_{i}^{{\mathcal{M}}_{a}},{f}_{i}^{{\mathcal{M}}_{b}}\in F:\sum _{\mathfrak{i}=1}^{n}{\mathfrak{q}}_{\mathfrak{i}}\left({\beta }_{{\theta }_{OB}^{\gamma }},{\beta }_{{\theta }^{\lambda }},A\right)u\left({f}_{i}^{{\mathcal{M}}_{a}}\left({s}_{\mathfrak{i}}\right)\right) \\ =\sum _{\mathfrak{i}=1}^{n}{\mathfrak{q}}_{\mathfrak{i}}\left({\beta }_{{\theta }_{OB}^{\gamma }},{\beta }_{{\theta }^{\lambda }},A\right)u\left({f}_{i}^{{\mathcal{M}}_{b}}\left({s}_{\mathfrak{i}}\right)\right)\end{gathered}$$

Second, we examine whether learned group specific beliefs affect the decision-maker’s subjective probabilities. If this were not the case, or if the changes still led to the exact same preferences, there would be no discrimination regarding social categories. Otherwise, the decision-maker makes use of statistical discrimination, which leads to the following preferences:

$$\begin{gathered}\exists f_{{i^{\text{*}}}}^{{{\mathcal{M}}_a}},f_{{i^{\text{*}}}}^{{{\mathcal {M}}_b}} \in F:\sum\limits_{\mathfrak{i} = 1}^n {{\mathfrak{q}_{\mathfrak{i}}}} \left( {{\beta _{\theta _{OB}^\gamma }},{\beta _{{\theta ^\lambda }}},{\beta _{{\mu ^\lambda }}},A} \right)u\left( {f_{{i^{\text{*}}}}^{{{\mathcal {M}}_a}}\left( {{s_{\mathfrak{i}}}} \right)} \right) \\ >\sum\limits_{\mathfrak{i} = 1}^n {{\mathfrak{q}_\mathfrak{i}}} \left( {{\beta _{\theta _{OB}^\gamma }},{\beta _{{\theta ^\lambda }}},{\beta _{{\mu ^\lambda }}},A} \right)u\left( {f_{{i^{\text{*}}}}^{{{\mathcal {M}}_b}}\left( {{s_{\mathfrak{i}}}} \right)} \right)\end{gathered}$$

To recapitulate: if there is statistical discrimination, the characteristics of an alternative and the group membership of its provider can no longer be separated. This is signalled by a little star (*) next to the alternative’s characteristics.
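The step from the equality of expected utilities to the strict preference above can be sketched numerically. The states, probabilities and utilities below are hypothetical; the point is only that group specific beliefs enter through the subjective probabilities, not through the utilities.

```python
def expected_utility(probs, utils):
    """Subjective expected utility over the states s_1 .. s_n."""
    return sum(q * u for q, u in zip(probs, utils))

# Two alternatives with the "same" characteristics, one provided by a
# member of group M_a, one by a member of group M_b. The utilities per
# state (success, failure) are identical for both providers:
utils = [1.0, 0.0]

# Learned group specific beliefs shift the subjective probabilities:
q_a = [0.6, 0.4]  # subjective probabilities given a provider from M_a
q_b = [0.5, 0.5]  # success deemed less likely for a provider from M_b

eu_a = expected_utility(q_a, utils)
eu_b = expected_utility(q_b, utils)
print(eu_a > eu_b)  # True: statistical discrimination in favour of M_a
```

Note that the decision-maker remains agent-neutral throughout: the same outcomes yield the same utilities. The discrimination arises solely because group membership changes the subjective probabilities of the states.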

Taste-Based and Statistical Discrimination

As in a situation of pure statistical discrimination, the decision-maker forms and updates his beliefs according to objective Bayesianism. However, in contrast to pure statistical discrimination, he does not have agent-neutral but agent-relative preferences. For example, a decision-maker has agent-relative preferences in a situation where providers offer the “same” characteristics if:

$$\begin{gathered}\exists f_i^{{{\mathcal {M}}_a}},f_i^{{{\mathcal {M}}_b}} \in F:\sum\limits_{\mathfrak{i} = 1}^n {{\mathfrak{q}_\mathfrak{i}}} \left( {{\beta _{\theta _{OB}^\gamma }},{\beta _{{\theta ^\lambda }}},A} \right)u\left( {f_i^{{{\mathcal {M}}_a}}\left( {{s_\mathfrak{i}}} \right)} \right) \\> \sum\limits_{\mathfrak{i} = 1}^n {{\mathfrak{q}_\mathfrak{i}}} \left( {{\beta _{\theta _{OB}^\gamma }},{\beta _{{\theta ^\lambda }}},A} \right)u\left( {f_i^{{{\mathcal {M}}_b}}\left( {{s_\mathfrak{i}}} \right)} \right)\end{gathered}$$

Additionally, the decision-maker might also use group specific beliefs, leading to a possible combination of taste-based and statistical discrimination. This can result in different situations. On the one hand, group specific beliefs might not noticeably change preferences; here, we would only speak of taste-based discrimination. On the other hand, group specific beliefs might significantly increase (decrease) the expected utility of the alternative whose provider is a member of the dispreferred (preferred) group, which changes preferences. This can lead to two possible outcomes, both of which are a combination of taste-based and statistical discrimination: either the decision-maker no longer prefers the alternative of the preferred group to that of the dispreferred group but is indifferent between the two, or he now even prefers that of the dispreferred group. Section 2.3 discussed these different situations in detail, which is why we do not go into them further here. In contrast, given that there are no (relevant) group specific beliefs, the decision-maker is solely a taste-based discriminator and does not display statistical discrimination.

Biased Statistical Discrimination

In case of biased statistical discrimination, the decision-maker has agent-neutral preferences and forms/updates his beliefs according to subjective Bayesianism or any non-Bayesian method, which is indicated by \({\beta }_{{\theta }_{SNB}^{\gamma }}\). In case of subjective Bayesianism, all kinds of inherent prior beliefs are possible, including group specific ones (\({\beta }_{{\mu }^{\gamma }}\)). There are only two requirements: (1) The subjective probabilities that the beliefs result in have to fulfil the three assumptions of probability theory (cf. Kolmogorov, 1933). (2) The inherent prior belief about updating beliefs involves Bayes’ law. In contrast, while non-Bayesian belief formation methods allow all kinds of inherent prior beliefs as well, they only have to fulfil the first requirement. Let us depict biased statistical discrimination in a situation where providers offer the “same” characteristics. As in case of pure statistical discrimination, we first exclude taste-based discrimination:

$$\begin{gathered} \forall f_i^{{{\mathcal {M}}_a}},f_i^{{{\mathcal {M}}_b}} \in F:\sum\limits_{\mathfrak{i} = 1}^n {{\mathfrak{q}_\mathfrak{i}}} \left( {{\beta _{\theta _{SNB}^\gamma }},{\beta _{{\theta ^\lambda }}},A} \right)u\left( {f_i^{{{\mathcal M}_a}}\left( {{s_\mathfrak{i}}} \right)} \right) \\ = \sum\limits_{\mathfrak{i} = 1}^n {{\mathfrak{q}_\mathfrak{i}}} \left( {{\beta _{\theta _{SNB}^\gamma }},{\beta _{{\theta ^\lambda }}},A} \right)u\left( {f_i^{{{\mathcal M}_b}}\left( {{s_\mathfrak{i}}} \right)} \right) \end{gathered}$$

Second, we examine whether inherent and/or learned group specific beliefs affect the decision-maker’s subjective probabilities. If this were not the case, or if the changes still led to the exact same preferences, there would be no discrimination regarding social categories. Otherwise, the decision-maker makes use of statistical discrimination, as demonstrated in the following preference ordering:

$$\begin{gathered}\exists f_{{i^{\text{*}}}}^{{{\mathcal M}_a}},f_{{i^{\text{*}}}}^{{{\mathcal M}_b}} \in F:\sum\limits_{\mathfrak{i} = 1}^n {{\mathfrak{q}_\mathfrak{i}}} \left( {{\beta _{\theta _{SNB}^\gamma }},{\beta _{{\theta ^\lambda }}},{\beta _{{\mu ^\gamma }}},{\beta _{{\mu ^\lambda }}},A} \right)u\left( {f_{{i^{\text{*}}}}^{{{\mathcal M}_a}}\left( {{s_\mathfrak{i}}} \right)} \right) \\> \sum\limits_{\mathfrak{i} = 1}^n {{\mathfrak{q}_\mathfrak{i}}} \left( {{\beta _{\theta _{SNB}^\gamma }},{\beta _{{\theta ^\lambda }}},{\beta _{{\mu ^\gamma }}},{\beta _{{\mu ^\lambda }}},A} \right)u\left( {f_{{i^{\text{*}}}}^{{{\mathcal M}_b}}\left( {{s_\mathfrak{i}}} \right)} \right)\end{gathered}$$

Here, the characteristics of an alternative and the group membership of its provider can no longer be separated, which is signalled by a little star (*) next to the alternative’s characteristics.

Taste-Based and Biased Statistical Discrimination

The last intersection comprises a decision-maker with agent-relative preferences who forms his beliefs according to subjective Bayesianism or any non-Bayesian method. First of all, the decision-maker has to have agent-relative preferences, as given, for example, by the following preferences, which refer to a situation where providers offer the “same” characteristics:

$$\begin{gathered}\exists f_i^{{{\mathcal M}_a}},f_i^{{{\mathcal M}_b}} \in F:\sum\limits_{\mathfrak{i} = 1}^n {{\mathfrak{q}_\mathfrak{i}}} \left( {{\beta _{\theta _{SNB}^\gamma }},{\beta _{{\theta ^\lambda }}},A} \right)u\left( {f_i^{{{\mathcal M}_a}}\left( {{s_\mathfrak{i}}} \right)} \right)\\> \sum\limits_{\mathfrak{i} = 1}^n {{\mathfrak{q}_\mathfrak{i}}} \left( {{\beta _{\theta _{SNB}^\gamma }},{\beta _{{\theta ^\lambda }}},A} \right)u\left( {f_i^{{{\mathcal M}_b}}\left( {{s_\mathfrak{i}}} \right)} \right)\end{gathered}$$

Additionally, the decision-maker might use his subjective or non-Bayesian group specific beliefs so as to form predictions. This can lead to a combination of taste-based and biased statistical discrimination. As in the intersection “taste-based and statistical discrimination”, there are several possible situations. On the one hand, group specific beliefs might not noticeably change preferences; here, we would only speak of taste-based discrimination. On the other hand, group specific beliefs might significantly increase (decrease) the expected utility of the alternative whose provider is a member of the dispreferred (preferred) group, which changes preferences. This can lead to two possible outcomes, both of which are a combination of taste-based and biased statistical discrimination: either the decision-maker no longer prefers the alternative of the preferred group to that of the dispreferred group but is indifferent between the two, or he now even prefers that of the dispreferred group. Section 2.3 discussed these different situations in detail, which is why we do not go into them further here. In contrast, given that there are no (relevant) group specific beliefs, the decision-maker is solely a taste-based discriminator and does not display biased statistical discrimination.

5.2 Implications for a Normative Theory of Discrimination

This dissertation has deliberately omitted a normative perspective on discrimination. This will not change in this chapter. Nevertheless, with the preceding model of discrimination in mind, we want to define what aspects a normative theory of discrimination has to consider. There are five main implications:

(1) We can examine discrimination from two perspectives: a motivational one and a behavioural one. While behavioural discrimination necessarily stems from motivational discrimination, motivational discrimination might not always be expressed in behaviour. For example, after the Second World War, a former Nazi might still hold some national socialist convictions but never display them. Is he still a Nazi then? The problem behind this question is as follows: if motivational discrimination is not expressed in behaviour, it is impossible to deduce it via empirical observation (maybe even for the former Nazi himself, given that these convictions are unconscious). In this dissertation, we circumvented the problem of this undetectable gap between motivation and behaviour by always referring to behavioural discrimination when we talked about discrimination. So, for us, there is discrimination if and only if motivational discrimination is also expressed in behaviour. A normative theory of discrimination has to address the above-mentioned issue as well and therefore answer the question of whether there is discrimination beyond behaviour.

(2) When a decision situation involves providers or receivers of different group membership, it rarely (if at all) underlies certainty. Of course, there are examples where certainty seems to apply and where group membership should therefore be irrelevant for any agent-neutral decision-maker. For instance, if you buy a Mars bar, its taste and thereby the (expected) utility it gives you should not be influenced by the fact that its provider is Christian or Muslim. Nevertheless, most often, interactions do not involve a standardised, fixed product that should give the same (expected) utility regardless of its provider. If you buy a croissant from bakery A, it will in all likelihood taste different from a croissant from bakery B. So, given that you have not already tried both croissants, it is uncertain which one is better. And actually, even if you have tried both croissants, you cannot be certain that they will taste the same the second time. Likewise, whether the riding experience with taxi driver A is better than that with taxi driver B is uncertain and might also change from time to time.Footnote 7

Now, in decision-making under uncertainty, group specific beliefs are often important for forming subjective probabilities, leading to statistical discrimination. As Lippert-Rasmussen (2013) writes: “[W]e are bound to reason inductively and to treat others on that basis, so in a way it is impossible not to engage in statistical discrimination.” (p. 1411) Indeed, there are examples where group membership should be irrelevant, for instance the group membership of a horse race lottery ticket provider, because the provider’s group membership does not influence the outcome of the horse race.Footnote 8 Yet, in many situations, there is a dependency between group membership and outcome. A doctor is more likely to heal a fractured leg than a lawyer. In contrast, a lawyer is more likely to successfully conduct your defence in court than a doctor. Similarly, if you offer your bus seat to an older person and not to a juvenile, this (normally) is an expression of statistical discrimination as well. Maybe the older person really appreciates your offer. But she might also feel offended by it because, to some degree, it emphasises her (potential) oldness/weakness. So, the outcome is uncertain. Nonetheless, it seems reasonable to consult group specific beliefs in this situation and form the hypothesis that the older person is more thankful to sit than the juvenile. Finally, sometimes statistical discrimination can even be life-saving. Although both men and women can develop breast cancer, women are much more likely to do so. Therefore, while breast cancer screenings are daily business for a gynaecologist, they are not for a urologist. In turn, prostate cancer is something only men can get, which is why such screenings are common for urologists but totally absent in case of gynaecologists (Bray et al., 2018).

In all the decision situations mentioned above, if you were not allowed to statistically discriminate, you would have to use a uniform prior.Footnote 9 This is true unless you have individual information about the providers/receivers involved. But then again, the interpretation of such individual information can itself be affected by group specific beliefs, such as how trustworthy or accurate typical members of the respective group are. In fact, Schauer (2003) states that “the distinction between the use of the profile [group specific beliefs] and the use of so-called direct evidence is far more illusory than real. Inferences drawn from observations or from physical evidence are themselves based on probabilistic generalization, and the cumulative set of inferences that produces a purportedly ‘direct’ conclusion or observation is nothing more than a collection of inferences drawn from generalizations known to be reliable. Just like a profile.” (p. 171f) Therefore, a normative theory of discrimination has to acknowledge the inevitability of statistical discrimination and thus the importance of group specific beliefs in decision-making under uncertainty.

(3) The way humans acquire their beliefs is at least partly incongruent with objective Bayesianism. So, if we statistically discriminate, the process of forming these statistics is potentially biased. On the one hand, we seem to have inherent prior beliefs that differ from objective Bayesianism. On the other hand, we do not appear to update our beliefs exclusively by use of Bayes’ law. The consequence is that, given the right inherent prior beliefs and/or updating rule, an agent-neutral decision-maker can form almost any belief despite substantial disconfirming evidence. Therefore, a normative theory of discrimination has to provide a definition of which beliefs are legitimate for statistical discrimination and which are not, and this definition cannot rest solely on how we arrive at these beliefs.

Now, it could be objected that, from a normative perspective, we simply declare objective Bayesian beliefs legitimate for statistical discrimination, whereas subjective and non-Bayesian beliefs are not. Yet, this idea faces two problems. (1) As mentioned before, humans do not seem to be objective Bayesians, which implies that we could never have legitimate beliefs and thus never legitimately statistically discriminate. (2) Objective Bayesianism also has its issues regarding the justifiability of beliefs. According to Gilboa et al. (2012), a major failure of the Bayesian approach is that in many real-life problems there is not sufficient information to suggest an objective Bayesian prior belief. Admittedly, in a small fraction of these problems a unique prior based on the principle of insufficient reason is sensible, particularly if the scenarios are symmetric. However, this is seldom the case. The authors write: “[T]he vast majority of decision problems encountered by economic agents fall into a gray area, where there is too much information to arbitrarily adopt a symmetric prior, yet too little information to justifiably adopt a statistically-based prior.” (p. 20) As a result, even under the assumption of objective Bayesianism, a normative theory of discrimination has to give a guideline for which beliefs are legitimate for statistical discrimination and which are not in such grey area situations. And this guideline cannot be grounded exclusively in the belief formation process.

Finally, in this dissertation, we focused on how we acquire beliefs and did not consider whether these beliefs ultimately are correct or incorrect. We did so because the correctness of beliefs is not a requirement for statistical discrimination. Yet, whether a certain belief is correct or not might be important for a normative theory of discrimination. While statistical discrimination on the basis of a correct belief only raises the problem of distributive fairness, statistical discrimination on the basis of an incorrect belief also raises the problem of false treatment. Here, false treatment means that the assumptions that give rise to statistical discrimination are incorrect. However, if a normative theory of discrimination differentiates between correct and incorrect beliefs, it has to define when a belief can be considered correct and when incorrect.

(4) Regarding how we treat others, there are two types of preferences: agent-neutral preferences and agent-relative preferences. Therefore, either everyone (excluding ourselves) is treated equally, which implies (weak) agent-neutrality, or some people are treated differently than others, which implies agent-relativity. So, if you treat men differently than women, black people differently than white people, or Christians differently than Muslims, you have agent-relative preferences and are thus a taste-based discriminator. But likewise, if you treat your significant other differently than your co-worker, your family differently than your neighbour, or your friends differently than strangers, you have agent-relative preferences too and thus are also a taste-based discriminator.Footnote 10

A normative theory of discrimination has to consider these various tastes for people/groups. In so doing, it has to define for which people/groups it is legitimate to have a taste, or in what situations it is legitimate to have a taste for certain people/groups. For example, what is the moral difference between having a sexual preference for men or women and a worker preference for men or women? Or what is the moral difference between only having black sexual partners because you have a taste for black skin colour and only having white friends because you have a taste for white people? Finally, let us quickly examine two incidents that are at least at first sight similar but led to quite different media echoes. In the first incident, a Colorado baker refuses to sell a wedding cake to a gay couple (Goldberg, 2017). In the second incident, the owner of a Virginia restaurant asks Sarah Huckabee Sanders, Donald Trump’s former White House press secretary, to leave the restaurant (Cochrane, 2018). What is the moral difference between not serving a gay couple due to their sexual orientation and not serving a politician due to her political orientation? And if there is one, does it depend on the precise political opinion?

Here, the different configurations of tastes that this dissertation revealed might help a normative theory of discrimination to separate legitimate from illegitimate tastes. First of all, we differentiated between weak and strong taste-based discrimination in the following manner: only in case of strong taste-based discrimination is the decision-maker willing to bear costs in order to choose the alternative whose characteristics are provided by a member of the preferred group. Second, there are tastes that stem from an ingroup-outgroup context (e.g. racial preference) and others that are unlikely to stem from such a context (e.g. sexual orientation). Third, provided that there is no statistical discrimination, different treatment of two groups is either the product of a taste for one group, a distaste for the other group, or both. Fourth, tastes and distastes can be intertwined with social preferences, meaning that a taste (distaste) for a certain group involves that the group’s well-being positively (negatively) affects the decision-maker’s well-being. Fifth, tastes for certain groups can also be independent of their members’ well-being. This means that someone prefers (disprefers) a certain group simply because interacting with members of that group provides her more (less) utility. For example, an employer might prefer attractive to unattractive employees simply because looking at attractive employees provides her more utility than looking at unattractive employees would do. Therefore, her motivation behind preferring attractive to unattractive employees has nothing to do with their well-being but is completely egoistic. These different configurations of tastes might lead to different normative evaluations of the behaviour they result in.

(5) This last implication is intertwined with the third one. Let us assume there is an algorithm that is programmed to adhere to objective Bayesianism. Moreover, the algorithm is not programmed to have any tastes for certain people/groups. What specific beliefs would such an algorithm acquire? We cannot really answer this question because it highly depends on the algorithm’s environment. So, let us further say that the internet (or certain parts of it) serves as the environment within which the algorithm learns. It can be assumed that the content of the internet is at least to some degree created by people who are taste-based discriminators. Now, let us again ask what beliefs an algorithm in such an environment would acquire. Since the environment is co-created by taste-based discriminators, their tastes will be reflected in the group-specific objective Bayesian beliefs of the algorithm. This can lead to seemingly racist or sexist beliefs even though the algorithm is agent-neutral.
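The mechanism described in this paragraph can be made concrete with a minimal sketch. The following Python snippet (an illustration constructed for this discussion, not part of the dissertation’s own material) implements a simple Beta-Bernoulli learner: it starts from a uniform, non-informative prior for every group, so it has no built-in tastes, and then updates purely on observed outcomes. The group labels and outcome frequencies are hypothetical; they stand in for an environment whose data were shaped by taste-based discriminators.

```python
from collections import defaultdict

class BayesianLearner:
    """Minimal Beta-Bernoulli learner: a uniform Beta(1, 1) prior
    per group (no initial taste or bias), updated only by data."""

    def __init__(self):
        self.alpha = defaultdict(lambda: 1)  # pseudo-count of positive outcomes
        self.beta = defaultdict(lambda: 1)   # pseudo-count of negative outcomes

    def observe(self, group, positive):
        if positive:
            self.alpha[group] += 1
        else:
            self.beta[group] += 1

    def belief(self, group):
        # Posterior mean of P(positive outcome | group).
        return self.alpha[group] / (self.alpha[group] + self.beta[group])

learner = BayesianLearner()

# Hypothetical environment: members of group "B" receive fewer positive
# outcomes (e.g. fewer hires), not because of ability, but because the
# recorded outcomes reflect other decision-makers' tastes.
for _ in range(80):
    learner.observe("A", True)
for _ in range(20):
    learner.observe("A", False)
for _ in range(40):
    learner.observe("B", True)
for _ in range(60):
    learner.observe("B", False)

print(round(learner.belief("A"), 2))  # → 0.79
print(round(learner.belief("B"), 2))  # → 0.4
```

The learner itself remains agent-neutral throughout: the asymmetric beliefs it ends up with are a faithful summary of its environment, which is exactly why belief formation alone cannot certify the legitimacy of the resulting statistical discrimination.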

We have seen such an example in the case of “Tay”. Tay was a chatbot from Microsoft that was active on the social media platform Twitter and learned from interacting with human users. The bot used a combination of artificial intelligence and written editorials (Hunt, 2016). Therefore, it did not adhere to objective Bayesianism, yet it also did not have any agent-relative preconfigurations. Tay started with tweets such as “can I just say that I am stoked to meet u? humans are super cool”, which after only 15 hours turned into “I fucking hate feminists and they should all die and burn in hell” and “Hitler was right I hate the jews” (Stuart-Ulin, 2018). Microsoft had to take Tay offline after no more than 16 hours and apologise for its racist and sexist tweets. However, it was of course not the algorithm in and of itself that made Tay a seeming racist or sexist but the environment in which it learned. Tay remained agent-neutral throughout.

What does the example of Tay mean for a normative theory of discrimination? In the third implication, we mentioned two reasons why a normative theory of discrimination cannot be reduced to how we get our beliefs. First, humans appear not to be objective Bayesians. Second, even under the assumption of objective Bayesianism, there are still many grey-area decision situations in which the justifiability of a statistically based prior is questionable. Now, the above paragraphs provide another reason: even outside such grey areas, the belief-formation process is an unreliable compass for the legitimacy of beliefs that can be used for statistical discrimination. This is because objective Bayesian beliefs are always a simple reflection of the decision-maker’s (or algorithm’s) environment. And if this environment embodies societal characteristics that are the product of taste-based discriminators, group-specific objective Bayesian beliefs will adopt and thereby reproduce them (DeDeo, 2016). It is important to note that these societal characteristics refer to both the meso-level (family, peers, etc.) and the macro-level (society, core culture, etc.). So, the last implication is that a normative theory of discrimination has to consider the past and present environment of the decision-maker as well.

To summarise, this dissertation leads to the following five implications for a normative theory of discrimination: (1) Discrimination beyond behaviour can be impossible to deduce, which complicates an (exclusively) motivational approach to discrimination. (2) In decision-making under uncertainty, statistical discrimination seems to be inevitable, which underscores the general importance of group-specific beliefs. (3) The way we arrive at our beliefs is insufficient to define legitimate and illegitimate statistical discrimination. (4) Tastes for certain people/groups are manifold, and if having one taste is legitimate but having another is not, there has to be an explanation of why the two tastes differ morally. (5) In order to define the legitimacy of a discriminatory act, one cannot exclusively regard the decision-maker but has to consider her environment as well.