As we have seen in the introduction, when we talk about discrimination, we normally talk about a certain kind of behaviour. If we treat a person or group differently compared to another person or group, this means that we behave differently depending on who our counterpart is. Therefore, dissecting discrimination implies dissecting the ways in which we behave. The tool of analysis used in this dissertation to investigate and explain behaviour is decision theory.Footnote 1

A decision theory assumes that behaviour is preceded by a decision-making process: A displayed behaviour \({x}_{i}\) was chosen from a respective choice set \(X\), in which \({x}_{i}\), \(i\in I\), is one of the possible alternatives.Footnote 2 In this dissertation, we define \(I\), which \(i\) is part of, as the set of all alternatives’ possible characteristics. For example, if someone wants to order one dish at a restaurant, the menu’s items are equivalent to her choice set \(X\). Let’s say that there are three alternatives on the menu, then \(X=\{{x}_{1},{x}_{2},{x}_{3}\}\). The dish \({x}_{i}\) that the person ultimately chooses has to be one of the three alternatives that the choice set \(X\) includes. It has to be highlighted, however, that the content of a decision-making process is diverse and not restricted to exchange processes such as buying a certain product. It involves interaction processes in a more general sense and thus also questions such as “which of several future neighbours would I prefer?” or “which of several strangers should I approach to ask for help?”. Additionally, a decision-making process can also contain hypothetical interaction processes and therefore hypothetical alternatives.

Having a set of alternatives which an individual can choose from is the first ingredient of a decision theory.Footnote 3 The second ingredient involves the preferences of the decision-maker and thus whether she prefers some alternatives to others and/or is indifferent between (some) alternatives. So, any two elements of \(X\) are compared to each other and put into relation: Either there is a preference relation (\(\succ ,\prec\)), meaning one alternative is preferred to the other; an indifference relation (\(\sim\)), meaning no alternative is preferred to the other; a combination of both (\(\succsim ,\precsim\)), meaning that both are possible; or the relation cannot be defined. Such a comparison is called a binary relation on \(X\). Overall, the possible comparisons are the ordered pairs in \(X\times X\), of which there are \({\left|X\right|}^{2}\). In case of the menu described before, \(X\times X=\{({x}_{1},{x}_{1}),({x}_{1},{x}_{2}),({x}_{1},{x}_{3}),({x}_{2},{x}_{1}),({x}_{2},{x}_{2}),({x}_{2},{x}_{3}),({x}_{3},{x}_{1}),({x}_{3},{x}_{2}),({x}_{3},{x}_{3})\}\), that is, nine pairs. (Kolmar, 2017)
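The set of comparisons for the menu example can be enumerated mechanically. The following Python sketch is purely illustrative: the labels `"x1"`, `"x2"`, `"x3"` are placeholder names for the three dishes and are not part of the formal framework.

```python
from itertools import product

# Hypothetical labels standing in for the three dishes x_1, x_2, x_3.
X = ["x1", "x2", "x3"]

# All ordered pairs (x_i, x_j), i.e. the Cartesian product X x X.
pairs = list(product(X, repeat=2))

for pair in pairs:
    print(pair)

print(len(pairs))  # |X|^2 = 9
```

Each printed pair corresponds to one comparison that a preference or indifference relation may or may not cover.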

There are three important assumptions regarding such comparisons of alternatives. First, when we compare an alternative \({x}_{i}\) to itself, we assume that there is an indifference relation between \({x}_{i}\) and \({x}_{i}\). This assumption is called reflexivity.

$${\bf{Assumption}}\,1\,({\bf{reflexivity}}):\forall {x_i} \in X:{x_i}\sim{x_i}$$

Second, if every pair in \(X\times X\) can be related through a preference relation, an indifference relation, or the combination of both, the assumption of completeness is fulfilled. Note that \({x}_{j}\), \(j\in I\), is a possible alternative from choice set \(X\) that differs from \({x}_{i}\).

$${\bf{Assumption}}\,2\,\left( {{\bf{completeness}}} \right):\forall {x_i},{x_j} \in X:{x_i} \succsim {x_j} \vee {x_j} \succsim {x_i}$$

Third, the assumption of transitivity says that in a choice set \(X\), which (among others) contains the alternatives \({x}_{i}\), \({x}_{j}\), and \({x}_{k}\), if \({x}_{i}\) is preferred (indifferent) to \({x}_{j}\) and \({x}_{j}\) is preferred (indifferent) to \({x}_{k}\), then \({x}_{i}\) is preferred (indifferent) to \({x}_{k}\) as well. Note that \({x}_{k}\), \(k\in I\), is a possible alternative from choice set \(X\) that is \(\ne {x}_{i}\) and \(\ne {x}_{j}\).

$${\bf{Assumption}}\,3\,\left( {{\bf{transitivity}}} \right):\forall {x_i},{x_j},{x_k} \in X:{x_i} \succsim {x_j} \wedge {x_j} \succsim {x_k}\mathop \Rightarrow \limits^! {x_i} \succsim {x_k}$$

In this dissertation, we presuppose that these three assumptions are fulfilled. We therefore assume that individuals have a preference ordering: All possible alternatives \({x}_{i}\) of an individual’s choice set \(X\) are consistently ordered according to how much they are preferred. Consequently, there is a well-defined subset of alternatives \({X}^{o}\subseteq X\) which describes the best or optimal alternative(s) given these preferences and choice set \(X\). In a next step, we also assume that individuals act according to their preferences. This means that they choose (one of) the best alternative(s) given their choice set \(X\) and their preference ordering. The behaviour that emerges from such a decision-making process is then called rational.Footnote 4 (Kolmar, 2017).
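For a finite choice set, the three assumptions and the set of optimal alternatives can be checked by brute force. The sketch below is a minimal illustration under assumptions of our own: the relation "weakly preferred to" is stored as a set `R` of ordered pairs, and the function names as well as the two example relations are hypothetical.

```python
from itertools import product


def is_reflexive(X, R):
    # Assumption 1: every alternative is related to itself.
    return all((x, x) in R for x in X)


def is_complete(X, R):
    # Assumption 2: any two alternatives are comparable in some direction.
    return all((a, b) in R or (b, a) in R for a, b in product(X, repeat=2))


def is_transitive(X, R):
    # Assumption 3: a >= b and b >= c imply a >= c.
    return all(
        (a, c) in R
        for a, b, c in product(X, repeat=3)
        if (a, b) in R and (b, c) in R
    )


def optimal_set(X, R):
    # Alternatives that are weakly preferred to every alternative in X.
    return {a for a in X if all((a, b) in R for b in X)}


X = {"x1", "x2", "x3"}

# Illustrative ordering: x1 ~ x2, and both are preferred to x3.
rank = {"x1": 2, "x2": 2, "x3": 1}
R_ordered = {(a, b) for a, b in product(X, repeat=2) if rank[a] >= rank[b]}

# Illustrative cycle: x1 over x2, x2 over x3, x3 over x1.
R_cycle = {(x, x) for x in X} | {("x1", "x2"), ("x2", "x3"), ("x3", "x1")}

print(optimal_set(X, R_ordered))
print(is_transitive(X, R_cycle), optimal_set(X, R_cycle))
```

The cyclic relation is reflexive and complete but not transitive, and it has no optimal alternative, which illustrates why all three assumptions are needed for a well-defined \({X}^{o}\).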

Lastly, we assume that the choice sets that we analyse in this dissertation are always finite. We can therefore express a preference ordering as a function. Such functional representations of preference orderings are called utility functions. So, the utility of an alternative \({x}_{i}\) and an alternative \({x}_{j}\) is given by \(u({x}_{i})\) and \(u({x}_{j})\). Next, \(u({x}_{i})\) and \(u({x}_{j})\) can be put into relation. This leads to either \(u\left({x}_{i}\right)>u\left({x}_{j}\right)\), \(u\left({x}_{i}\right)\ge u\left({x}_{j}\right)\), \(u\left({x}_{i}\right)=u\left({x}_{j}\right)\), \(u\left({x}_{i}\right)\le u\left({x}_{j}\right)\), or \(u\left({x}_{i}\right)<u\left({x}_{j}\right)\).
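One simple way to construct such a utility function on a finite choice set is to count, for each alternative, how many alternatives it is weakly preferred to. The sketch below assumes a complete and transitive ordering given as tiers of indifferent alternatives; the names `tiers`, `weakly_prefers`, and `utility_from_ordering` are illustrative choices, and the counting construction is only one of many possible representations.

```python
def utility_from_ordering(X, weakly_prefers):
    # u(x) = number of alternatives y in X with x weakly preferred to y.
    return {x: sum(weakly_prefers(x, y) for y in X) for x in X}


# Illustrative ordering, best tier first: x1 ~ x2, both preferred to x3.
tiers = [["x1", "x2"], ["x3"]]
tier_of = {x: t for t, tier in enumerate(tiers) for x in tier}
X = list(tier_of)


def weakly_prefers(a, b):
    # a is weakly preferred to b if a sits in the same or a better tier.
    return tier_of[a] <= tier_of[b]


u = utility_from_ordering(X, weakly_prefers)
print(u)  # {'x1': 3, 'x2': 3, 'x3': 1}
```

The resulting numbers carry only ordinal information: \(u({x}_{i})\ge u({x}_{j})\) holds exactly when \({x}_{i}\succsim {x}_{j}\).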

In this chapter, we analyse discrimination through the lens of decision theory as described above where individuals behave rationally. By examining different types of preference orderings or utility functions, we try to determine the various manifestations of discrimination. The first subchapter provides a general definition of which behaviour is discriminatory and which is not. Then, we will focus on the different ways identity and group membership can influence a preference ordering. We will do so regarding two decisional settings: under certainty and under uncertainty.Footnote 5 The last subchapter addresses the question of how we detect the accurate type(s) of discrimination in a given situation.

2.1 When Is There Discrimination?

Let us start our dissection of discrimination with an investigation of the word’s origin. Discrimination stems from discriminare, which is Latin for “to separate” or “to distinguish”. So, the original meaning of the word has nothing to do with how you treat people but is limited to perception. In this sense, you cannot discriminate against something but only between things. Without this ability, we would not be able to differentiate between two in fact different objects but perceive them as one and the same. Or we might know they are not the same but could not tell the difference between them. For example, an inexperienced wine-taster tastes two different wines, wine A and wine B, and is not able to distinguish them in a blind test because they taste the same to her. Yet, an experienced wine-taster notices that wine A is a little bit fruitier in the finish, whereas wine B is overall headier. Therefore, while the inexperienced wine-taster is not able to discriminate between wine A and wine B, the experienced wine-taster is.

Although today the word discrimination is no longer primarily used in this way, it still contains the original meaning as well. The Cambridge Dictionary (2018), which we already consulted for our definition of discrimination presented in the introductionFootnote 6, provides a second definition: “The ability to see the difference between two things or people.” Like the original Latin word discriminare, this second definition of discrimination is restricted to perception. In the following statement, the author Christopher Hitchens (2005) wanted to emphasise precisely the perceptional and therefore original meaning of discrimination: “It especially annoys me when racists are accused of ‘discrimination.’ The ability to discriminate is a precious facility; by judging all members of one ‘race’ to be the same, the racist precisely shows himself incapable of discrimination.” (p. 109)

The statement of Hitchens seems to imply that the behavioural definition of discrimination, which reflects how the expression is normally used today, and the perceptional one, which stems from the word’s original meaning, are at odds: A racist, who “discriminates” against other races by treating these races worse than her own race, is actually incapable of discrimination since otherwise she would not discriminate against other races. This is because if she were able to discriminate, she would realise that people of one race are very diverse and thus it is not sensible to judge them to be the same or to use race as relevant information.

However, the implication that the behavioural and perceptional definition of discrimination are in conflict is a fallacy. Indeed, a racist might not discriminate enough between people.Footnote 7 Yet, in order to be a racist, she has to be able to discriminate between different races. Let’s think of a blind person who is unaware of the fact that there are black and white people. If she has a black and a white individual in front of her, she cannot discriminate against the black or white individual because she is unable to distinguish their skin colour in the first place. So, the second definition is a requirement for the first: You can only treat people or things systematically differently if you are able to distinguish them. Otherwise, your different treatment is the product of chance and not discrimination.

The following example deepens the above argument by introducing the difference between motivational and behavioural discrimination. Let us assume that a non-blind person only tips white waiters. While her first waiter was white and got a tip, the second waiter was black and did not get a tip. Thus, the non-blind person discriminates against black waiters. Now, a blind person would also like to do that, meaning she has the motivation to discriminate. However, she never knows the skin colour of her waiter and therefore cannot turn her motivation into behaviour. We assume that this makes her indifferent between giving and not giving a tip. She therefore uses a heads or tails app on her phone, whereby heads produces a high and tails a low tone, to decide for her. In case of a low tone, she gives a tip, whereas a high tone implies not giving one. Applying this method, she gave a tip to the first waiter who happened to be white but not to the second waiter who happened to be black.

Obviously, in both cases the tip giver had the motivation to discriminate and the two waiters were ultimately treated differently since only one got a tip. Moreover, from a behavioural perspective both the non-blind and the blind person tipped the white but not the black waiter. So, at first sight it seems that both tip givers not only motivationally but also behaviourally discriminated against the black waiter. However, in case of the blind person, this is wrong because her different treatment was the product of chance. Had the first waiter been black and the second white, the black and not the white waiter would have got a tip. Consequently, the motivation to discriminate is not sufficient for behavioural discrimination. For that, a decision-maker also has to be able to identify the persons/things between whom/which she wants to behaviourally discriminate in the decision situation. In this dissertation, when we talk about discrimination we assume that the discriminator is able to do that and thus always discriminates on a behavioural level too.Footnote 8
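The contrast between the two tip givers can be made concrete with a small sketch. The Python below is illustrative only: the function names and string labels are assumptions introduced here. It encodes each tip giver as a rule and asks whether the tip ever depends on the waiter’s skin colour once the outcome of the heads or tails app is held fixed.

```python
from itertools import product

COLOURS = ("white", "black")
COIN = ("heads", "tails")  # heads = high tone, tails = low tone


def sighted_tipper(colour, coin):
    # Tips white waiters only; the coin plays no role.
    return colour == "white"


def blind_tipper(colour, coin):
    # Cannot perceive colour; tips on a low tone (tails) only.
    return coin == "tails"


def treats_systematically_differently(tipper):
    # True if, for some fixed coin outcome, the tip changes with colour.
    return any(
        tipper(a, coin) != tipper(b, coin)
        for coin in COIN
        for a, b in product(COLOURS, repeat=2)
    )


print(treats_systematically_differently(sighted_tipper))  # True
print(treats_systematically_differently(blind_tipper))    # False
```

Under this reading, the blind person’s rule never conditions on skin colour, so any difference in who receives a tip is attributable to the coin alone.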

Accordingly, this assumption also entails that an act of discrimination is triggered by some motivation or some beliefs and desires of the decision-maker. Such an approach requires a substantive interpretation of utility since we do not only want to analyse behaviour but also deduce the motivation and the psychological profile behind it. As Bermúdez (2009) writes: “The full force of thinking about decision theory as a regimentation of commonsense psychological explanation is only available on the substantive way of thinking about utility. If utility and probability assignments are to explain behavior in the way that attributions of beliefs and desires are thought to explain behavior then the utility and probability values must track psychologically real entities that are independent of the behavior being explained. There is relatively little explanatory power to be gained from explaining behavior in terms of probability and utility assignments if, as the operational theory [revealed-preference theory] holds, those assignments are simply redescriptions of the behavior being explained.” (p. 53) As a consequence, in this dissertation, utility is an independently specifiable quantity that is not simply a redescription of the decision-maker’s preferences.Footnote 9

Now, the tip example used above leads to two requirements that have to be fulfilled for an act to be discriminatory: (1) In the decision situation, there has to be a differentiation between two or more things/people. (2) At least one of these things/people has to be treated in a systematically different way compared to the other things/people. If we translate these requirements into decision theory, we obtain the following definition of discrimination: In a choice set \(X\), there are at least two alternatives \({x}_{i}\) and \({x}_{j}\) which are not equivalent. Furthermore, there is at least one alternative \({x}_{i}\) which is preferred to another alternative \({x}_{j}\).

$$\exists {x}_{i},{x}_{j}\in X:{x}_{i}\ne {x}_{j}$$
$$\wedge\ \exists {x}_{i},{x}_{j}\in X:u({x}_{i})>u({x}_{j})$$

Accordingly, an act is not discriminatory if there is no differentiation between two or more things/people or if none of the distinguished things/people is treated in a systematically different way compared to the other things/people.Footnote 10 In other words, in a choice set \(X\), there is only one alternative \({x}_{i}\) or multiple alternatives \({x}_{i}\) which are all equivalent or there is indifference between all alternatives that are part of \(X\).Footnote 11

$$X=\{{x}_{1}\}$$
$$\vee\ \forall {x}_{i},{x}_{j}\in X:{x}_{i}={x}_{j}\Leftrightarrow \forall {x}_{i}\in X:{x}_{i}\cup X={x}_{i}\cap X$$
$$\vee\ \forall {x}_{i},{x}_{j}\in X:u({x}_{i})=u({x}_{j})$$

Let us exemplify these last three definitions. The first one describes a situation where the choice set only contains one alternative. For example, you have to choose a dish from a menu that exclusively contains the daily special. The second one is very similar. Again, you only have one true alternative, yet, it seems like there is more than one. For example, a menu says that you can either order a burger with fries or fries with a burger. Since both alternatives are equivalent your actual choice set only contains one alternative. Finally, the third definition depicts a situation where you have different alternatives but are indifferent between all of them. For example, you are in a foreign country and do not understand one word of the menu. So, while you realise that there are different alternatives, you have no idea what they involve which makes you indifferent between them. But of course, such a situation can also occur if you very well know the difference between all your alternatives but simply are indifferent between them.
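The definition of discrimination and the three cases of non-discrimination can be sketched as a single check. The representation below is an assumption made for illustration: a choice set maps menu labels to their content (a `frozenset`), so that two labels with the same content count as equivalent, and `utility` maps content to a number. The function name `is_discriminatory` and all example data are hypothetical.

```python
def is_discriminatory(choice_set, utility):
    # Requirement 1: at least two alternatives that are not equivalent.
    distinct = set(choice_set.values())
    if len(distinct) < 2:
        return False
    # Requirement 2: some alternative yields strictly more utility than another.
    return len({utility[content] for content in distinct}) > 1


burger = frozenset({"burger", "fries"})

# Case 1: the menu contains only the daily special.
daily_special_only = {"daily special": frozenset({"daily special"})}

# Case 2: two labels, one actual alternative.
same_dish_twice = {"burger with fries": burger, "fries with a burger": burger}

# Case 3: different dishes, but indifference between all of them.
foreign_menu = {name: frozenset({name}) for name in ("dish A", "dish B", "dish C")}
flat = {content: 1 for content in foreign_menu.values()}

# Contrast case: the same dishes with a strict ranking.
ranked = {
    frozenset({"dish A"}): 3,
    frozenset({"dish B"}): 2,
    frozenset({"dish C"}): 1,
}

print(is_discriminatory(daily_special_only, {frozenset({"daily special"}): 1}))
print(is_discriminatory(same_dish_twice, {burger: 1}))
print(is_discriminatory(foreign_menu, flat))
print(is_discriminatory(foreign_menu, ranked))
```

Only the last call reports discrimination, since it is the only case with both a differentiation between alternatives and a strict preference.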

The circumstance that in this chapter we combined the perceptional and the behavioural definition of discrimination expanded the original behavioural definition: You do not only discriminate if you treat people differently in a systematic way but if you treat anything differently in a systematic way. Yet, whether someone discriminates against apples through preferring pears to apples is not per se of interest in this dissertation because it involves a non-social context and thereby what we call non-social discrimination. Consequently, in the next chapters, we focus on “treating a person or particular group” differently which is what we call social discrimination.Footnote 12

2.2 Social Discrimination Under Certainty

Decision-making under certainty implies that the decision-maker knows the exact outcome of a given alternative as well as the utility it provides. Under such circumstances, there is only one possible form of social discrimination, namely taste-based discrimination. The expression taste-based discrimination stems from Becker (1971). In his book The Economics of Discrimination he explores discrimination in the labour market, for example in the form of wage gaps between male and female or white and black workers. Becker suggests that individual tastes for discrimination lead to these inequalities: An employer prefers a white to a black worker, even though the white worker might be less productive, in order to avoid interacting with black people. So, the employer has a taste or in other words a preference for a certain skin colour. However, it is important to note that regarding the labour market, tastes for discrimination are not restricted to employers. Becker actually describes three models of which each covers a different source of discriminatory tastes: employers, co-workers, and customers. All of them can lead to discrimination in the labour market (Guryan & Charles, 2013). In this dissertation, we adopt Becker’s idea of taste-based discrimination and, via decision theory, expand it to behaviour in general.

At the beginning of our analysis of taste-based discrimination, we are only interested in who the involved provider of an alternative is. So, our choice set \(X\) consists of multiple alternatives that always have the same characteristics \(i\) (\(I=\{1\}\)) but still differ from each other because these characteristics are “offered” by different providers.Footnote 13 Thus, we assume that we can (at least theoretically) separate an alternative’s characteristics from their provider (who would normally be part of the characteristics).Footnote 14 Note that while the expression “characteristics are offered by different providers” seems to imply an exchange process between decision-maker and provider, this does not have to be the case. It actually includes interaction processes more generally. So, the expression “characteristics are offered by different providers” should rather be understood as “you can have these characteristics with that provider or that provider etc.”. Moreover, a provider does also not have to be aware of the fact that she offers these characteristics.Footnote 15 Now, within such a choice set \(X\), \({x}_{i}^{m}\) (\(m\in M\) and \(i\in I\)) embodies one possible alternative whose characteristics \(i\) (that in all alternatives are the same since \(I=\{1\}\)) are offered by provider \(m\). Here, \(M\), which \(m\) is part of, is the set of all possible providers that offer the alternatives’ characteristics.

For example, we want to buy a Mars bar \(({x}_{1})\) and can do so either from provider 1 or from provider 2 on the same conditions. So, \(I=\left\{1\right\}\), \(M=\left\{\mathrm{1,2}\right\}\), and therefore \(X=\left\{{x}_{1}^{1},{x}_{1}^{2}\right\}\). The fact that providers offer on the same conditions is important because otherwise \({x}_{1}^{1}\) and \({x}_{1}^{2}\) would not have the same characteristics. Now, given we are not indifferent between these two alternatives, there is a case of taste-based discrimination. This means we prefer one provider to the other and thus gain more utility if we buy from one provider compared to the other although both offer the same characteristics. As a result, in such a situation, the identity of an alternative’s provider must in and of itself be relevant to us. In generalised terms, there is taste-based discrimination in a situation where providers offer the same characteristics \(i\) if the following requirements are fulfilled. Note that \({x}_{i}^{n}\), \(n\in M\), is a possible alternative from choice set \(X\) that is \(\ne {x}_{i}^{m}\) and only differs from \({x}_{i}^{m}\) in terms of the provider.

$$\exists {x}_{i}^{m},{x}_{i}^{n}\in X:u\left({x}_{i}^{m}\right)>u\left({x}_{i}^{n}\right)$$

Accordingly, under the above-mentioned circumstances, there is a case of non-discrimination regarding providers’ identities if:

$$\forall {x}_{i}^{m},{x}_{i}^{n}\in X:u\left({x}_{i}^{m}\right)=u\left({x}_{i}^{n}\right)$$
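For the case in which all providers offer the same characteristics, the two conditions above reduce to comparing utilities across providers. The following sketch assumes a hypothetical dictionary that maps each provider \(m\) to the utility of obtaining the same good from that provider; the numbers are invented for illustration.

```python
def preferred_provider_pairs(utility_by_provider):
    # All pairs (m, n) with u(x_i^m) > u(x_i^n) for identical characteristics i.
    return [
        (m, n)
        for m in utility_by_provider
        for n in utility_by_provider
        if utility_by_provider[m] > utility_by_provider[n]
    ]


# Hypothetical utilities of buying the same Mars bar from provider 1 or 2.
with_taste = {1: 5, 2: 3}
without_taste = {1: 4, 2: 4}

print(preferred_provider_pairs(with_taste))     # [(1, 2)]
print(preferred_provider_pairs(without_taste))  # []
```

A non-empty list corresponds to taste-based discrimination, whereas an empty list corresponds to non-discrimination regarding providers’ identities.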

To continue, we analyse a situation where alternatives differ not only regarding which provider offers the characteristics of an alternative but also regarding what these characteristics are. So now, \(I\) has more than one element. For example, an individual can choose between a Mars bar \(({x}_{1})\) and a Snickers bar \(({x}_{2})\). Moreover, there are two providers \((M=\{\mathrm{1,2}\})\), who both offer the two bars on the same conditions. The choice set \(X\) of the individual is as follows: \(X=\{{x}_{1}^{1},{x}_{2}^{1},{x}_{1}^{2},{x}_{2}^{2}\}\). First, we assume that the decision-maker is indifferent between Mars and Snickers. Thus, in a choice set \(\mathcal{X}\), where providers are unknown, the individual has the following preferences: \({x}_{1},{x}_{2}\in \mathcal{X}:u\left({x}_{1}\right)=u({x}_{2})\). Now, given the identity of providers is irrelevant, we should find the same preference ordering in case of a choice set \(X\) where the identity of providers is known:

$${x}_{1},{x}_{2}\in \mathcal{X}:u({x}_{1})=u({x}_{2})$$
$$\begin{gathered} \wedge\ x_{1}^{1},x_{2}^{1},x_{1}^{2},x_{2}^{2}\in X:u(x_{1}^{1})=u(x_{2}^{1})\wedge u(x_{1}^{2})=u(x_{2}^{2})\wedge u(x_{1}^{1})=u(x_{2}^{2}) \\ \wedge\ u(x_{1}^{2})=u(x_{2}^{1})\wedge u(x_{1}^{1})=u(x_{1}^{2})\wedge u(x_{2}^{1})=u(x_{2}^{2}) \end{gathered}$$

If this is the case, there is no taste-based discrimination. So, in generalised terms, there is non-discrimination regarding providers’ identities when alternatives have differing characteristics and an individual is indifferent between these if:

$$\forall {x}_{i},{x}_{j}\in \mathcal{X}:u({x}_{i})=u({x}_{j})$$
$$\begin{gathered} \wedge\ \forall x_{i}^{m},x_{i}^{n},x_{j}^{m},x_{j}^{n}\in X:u(x_{i}^{m})=u(x_{j}^{m})\wedge u(x_{i}^{n})=u(x_{j}^{n})\wedge u(x_{i}^{m})=u(x_{j}^{n}) \\ \wedge\ u(x_{i}^{n})=u(x_{j}^{m})\wedge u(x_{i}^{m})=u(x_{i}^{n})\wedge u(x_{j}^{m})=u(x_{j}^{n}) \end{gathered}$$

Second, we analyse a situation where the decision-maker prefers alternatives that contain characteristics \(i\) to alternatives that contain characteristics \(j\). For example, let us say that the decision-maker prefers Mars to Snickers. As a consequence, in a choice set \(\mathcal{X}\), where providers are unknown, the individual has the following preferences: \({x}_{1},{x}_{2}\in \mathcal{X}:u({x}_{1})>u({x}_{2})\). Given that the decision-maker does not care about the identity of providers, we should find the following preference ordering in case of a choice set \(X\) where providers’ identities are known:

$${x}_{1},{x}_{2}\in \mathcal{X}:u({x}_{1})>u({x}_{2})$$
$$\begin{gathered} \wedge\ x_{1}^{1},x_{2}^{1},x_{1}^{2},x_{2}^{2}\in X:u(x_{1}^{1})>u(x_{2}^{1})\wedge u(x_{1}^{2})>u(x_{2}^{2})\wedge u(x_{1}^{1})>u(x_{2}^{2})\wedge u(x_{1}^{2})>u(x_{2}^{1}) \\ \wedge\ u(x_{1}^{1})=u(x_{1}^{2})\wedge u(x_{2}^{1})=u(x_{2}^{2}) \end{gathered}$$

An individual with such a preference ordering does discriminate between the alternatives’ characteristics but is indifferent between the providers of these characteristics. Therefore, there is non-social discrimination but no taste-based discrimination. In generalised terms, there is non-discrimination regarding providers’ identities when alternatives have two differing characteristics and an individual prefers characteristics \(i\) to characteristics \(j\) if:Footnote 16

$$\exists !{x}_{i},{x}_{j}\in \mathcal{X}:u\left({x}_{i}\right)>u({x}_{j})$$
$$\begin{gathered} \wedge\ \forall x_{i}^{m},x_{i}^{n},x_{j}^{m},x_{j}^{n}\in X:u(x_{i}^{m})>u(x_{j}^{m})\wedge u(x_{i}^{n})>u(x_{j}^{n})\wedge u(x_{i}^{m})>u(x_{j}^{n}) \\ \wedge\ u(x_{i}^{n})>u(x_{j}^{m})\wedge u(x_{i}^{m})=u(x_{i}^{n})\wedge u(x_{j}^{m})=u(x_{j}^{n}) \end{gathered}$$

A preference ordering which has an indifference relation between all providers that offer the same characteristics of an alternative is agent-neutral. The term agent-neutral was introduced by the philosopher Derek Parfit (1984) and builds on Thomas Nagel’s idea of objective and subjective reasons (Nagel, 1970). Nagel (1986) later adopted Parfit’s expressions and says: “If a reason can be given a general form which does not include an essential reference to the person who has it, it is an agent-neutral reason … If on the other hand, the general form of a reason does include an essential reference to the person who has it then it is an agent-relative reason.” (p. 152–153) For example, if an individual prefers Mars to Snickers, it would be an agent-neutral reason to always buy Mars, regardless of who the supplier is. However, given the individual prefers Mars to Snickers but also supplier A to supplier B, it would be an agent-relative reason to buy Mars only from supplier A and/or if supplier A does not have any Mars to rather buy Snickers from supplier A than Mars from supplier B.

Normally, agent-neutrality does not only include equal treatment of all others but equal treatment of all, including oneself. Therefore, if an agent has a reason to do something just in case her doing it would increase her welfare, that would be an agent-relative reason (Ridge, 2017). Yet, in this dissertation, when we speak of agent-neutral preferences, we do not necessarily presuppose that an agent has to treat herself the same way as she treats others. For example, a decision-maker has a choice set \(X\) with the following three alternatives: \({x}_{1}\) = “the decision-maker gets $100”; \({x}_{2}\) = “person 2 gets $100”; \({x}_{3}\) = “person 3 gets $100”. Without further information about these three individuals, it can be assumed that the decision-maker, person 2, and person 3 would all be equally happy to get $100. Thus, she has reason to give $100 to any of them (including herself), which should make her indifferent between the alternatives. Therefore, the preference ordering \({x}_{1},{x}_{2},{x}_{3}\in X:u\left({x}_{1}\right)=u\left({x}_{2}\right)\wedge u\left({x}_{2}\right)=u({x}_{3})\) is agent-neutral in the concept’s original sense. Now, in addition to this original use of agent-neutrality, which we label strong agent-neutrality, we introduce a second one that we call weak agent-neutrality: If the decision-maker treats all her counterparts in the same way but herself in a different way, her actions are weakly agent-neutral. In terms of the above example, a preference ordering is weakly agent-neutral if there is indifference between “person 2 gets $100” and “person 3 gets $100” but no indifference between “the decision-maker gets $100” and “person 2 or 3 gets $100”. As a result, the preference orderings \({x}_{1},{x}_{2},{x}_{3}\in X:u\left({x}_{1}\right)>u\left({x}_{2}\right)\wedge u\left({x}_{2}\right)=u({x}_{3})\) and \({x}_{1},{x}_{2},{x}_{3}\in X:u\left({x}_{1}\right)<u\left({x}_{2}\right)\wedge u\left({x}_{2}\right)=u({x}_{3})\) are weakly agent-neutral.
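The distinction between strong and weak agent-neutrality in the $100 example can be expressed as a simple classification. In the sketch below, `utility` is a hypothetical dictionary mapping each possible recipient to the decision-maker’s utility of that recipient getting $100; the key `"dm"` for the decision-maker and the labels returned by the function are our own illustrative choices.

```python
def classify_agent_neutrality(utility, decision_maker):
    # Utilities attached to all counterparts (everyone except the decision-maker).
    others = {value for person, value in utility.items() if person != decision_maker}
    if len(others) > 1:
        # Counterparts are treated differently from one another.
        return "neither"
    if not others or others == {utility[decision_maker]}:
        # Everyone, including the decision-maker, is treated the same.
        return "strong"
    # Counterparts are treated alike, the decision-maker differently.
    return "weak"


print(classify_agent_neutrality({"dm": 1, "p2": 1, "p3": 1}, "dm"))  # strong
print(classify_agent_neutrality({"dm": 2, "p2": 1, "p3": 1}, "dm"))  # weak
print(classify_agent_neutrality({"dm": 1, "p2": 2, "p3": 2}, "dm"))  # weak
print(classify_agent_neutrality({"dm": 1, "p2": 2, "p3": 1}, "dm"))  # neither
```

The last case, in which person 2 is preferred to person 3, is the type of preference ordering that is neither strongly nor weakly agent-neutral.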

With that in mind, we investigate preferences that are neither strongly agent-neutral nor weakly agent-neutral. Let us begin with a situation where someone is indifferent between the alternatives’ characteristics. For example, an individual can again choose between Mars \(({x}_{1})\) and Snickers \(({x}_{2})\). So, in a choice set \(\mathcal{X}\), where providers are unknown, the individual has the following preferences: \({x}_{1},{x}_{2}\in \mathcal{X}:u\left({x}_{1}\right)=u\left({x}_{2}\right)\). Now, two providers \((M=\{\mathrm{1,2}\})\) offer the two goods. This results in the following choice set \(X\), where the identity of providers is known: \(X=\{{x}_{1}^{1},{x}_{2}^{1},{x}_{1}^{2},{x}_{2}^{2}\}\). We assume that through preferring provider 1 to provider 2, the individual has a taste for provider 1. This means that she prefers the alternatives that involve provider 1 to the alternatives that involve provider 2. Otherwise, she is indifferent. Therefore, her preference ordering is:

$${x}_{1},{x}_{2}\in \mathcal{X}:u\left({x}_{1}\right)=u({x}_{2})$$
$$\begin{gathered} \wedge\ x_{1}^{1},x_{2}^{1},x_{1}^{2},x_{2}^{2}\in X:u(x_{1}^{1})=u(x_{2}^{1})\wedge u(x_{1}^{2})=u(x_{2}^{2})\wedge u(x_{1}^{1})>u(x_{2}^{2}) \\ \wedge\ u(x_{2}^{1})>u(x_{1}^{2})\wedge u(x_{1}^{1})>u(x_{1}^{2})\wedge u(x_{2}^{1})>u(x_{2}^{2}) \end{gathered}$$

In generalised terms, there is taste-based discrimination in a situation where alternatives differ regarding their characteristics and the decision-maker is indifferent between these characteristics if:

$$\forall {x}_{i},{x}_{j}\in \mathcal{X}:u({x}_{i})=u({x}_{j})$$
$$\wedge\ \exists {x}_{i}^{m},{x}_{i}^{n},{x}_{j}^{n}\in X:u({x}_{i}^{m})>u({x}_{i}^{n})\vee u({x}_{i}^{m})>u({x}_{j}^{n})$$

Finally, what if the decision-maker is not indifferent between (all) alternatives’ characteristics? So, besides her preference for certain providers, she also prefers some characteristics to others. To resume our Mars and Snickers example with the respective choice set \(X=\{{x}_{1}^{1},{x}_{2}^{1},{x}_{1}^{2},{x}_{2}^{2}\}\), the individual prefers both Mars \(({x}_{1})\) to Snickers \(({x}_{2})\) and provider 1 to 2. Regarding such preferences, five binary relations of \(X\times X\) are clear: \(({x}_{1}^{1}\succ {x}_{2}^{1})\), \(({x}_{1}^{2}\succ {x}_{2}^{2})\), \(({x}_{1}^{1}\succ {x}_{1}^{2})\), \(({x}_{2}^{1}\succ {x}_{2}^{2})\), and \(({x}_{1}^{1}\succ {x}_{2}^{2})\). However, what if only provider 2 offers Mars, so that the decision-maker has to compare Snickers from provider 1 \(({x}_{2}^{1})\) with Mars from provider 2 \(({x}_{1}^{2})\)? Here, three binary relations are possible: \(({x}_{2}^{1}\succ {x}_{1}^{2})\) or \(({x}_{2}^{1}\prec {x}_{1}^{2})\) or \(({x}_{2}^{1}\sim {x}_{1}^{2})\). The first binary relation is true if it is more important to the decision-maker that she gets her good from provider 1 and not from provider 2 compared to which good she gets. The second binary relation is true if it is more important to the decision-maker that she gets a Mars \(({x}_{1})\) and not a Snickers \(({x}_{2})\) compared to who the provider of the good is. Ultimately, the third binary relation is true if these two effects precisely balance each other out. Therefore, we obtain the following preferences:

$${x}_{1},{x}_{2}\in \mathcal{X}:u\left({x}_{1}\right)>u({x}_{2})$$
$$\begin{gathered} \wedge \,x_{1}^{1} ,x_{2}^{1} ,x_{1}^{2} ,x_{2}^{2} \in X:u\left( {x_{1}^{1} } \right)\text{ > }\,u\left( {x_{2}^{1} } \right) \wedge u\left( {x_{1}^{2} } \right)\text{ > }\,u\left( {x_{2}^{2} } \right) \wedge u\left( {x_{1}^{1} } \right)\text{ > }\,u\left( {x_{2}^{2} } \right) \hfill \\ \wedge \,u\left( {x_{1}^{1} } \right)\text{ > }\,u\left( {x_{1}^{2} } \right) \wedge u\left( {x_{2}^{1} } \right)\text{ > }\,u\left( {x_{2}^{2} } \right) \wedge \,\left( {u\left( {x_{2}^{1} } \right) \ge \,u\left( {x_{1}^{2} } \right) \vee u\left( {x_{2}^{1} } \right) \le \,u\left( {x_{1}^{2} } \right)} \right) \hfill \\ \end{gathered}$$

In generalised terms, there is taste-based discrimination in a situation where alternatives have two differing characteristics and the decision-maker prefers characteristics \(i\) to characteristics \(j\) if:

$${\exists !x}_{i},{x}_{j}\in \mathcal{X}:u\left({x}_{i}\right)>u\left({x}_{j}\right)$$
$$\begin{gathered} \wedge \exists {x}_{i}^{m},{x}_{i}^{n},\mathrm{}{x}_{j}^{m},{x}_{j}^{n}\in X:u\left({x}_{i}^{m}\right)>u\left({x}_{j}^{m}\right)\wedge u\left({x}_{i}^{n}\right)>u\left({x}_{j}^{n}\right)\wedge u\left({x}_{i}^{m}\right)>u\left({x}_{j}^{n}\right)\hfill \\\wedge\,u\left({x}_{i}^{m}\right)>u\left({x}_{i}^{n}\right)\wedge u\left({x}_{j}^{m}\right)>u\left({x}_{j}^{n}\right)\wedge \left(u\left({x}_{j}^{m}\right)\ge u\left({x}_{i}^{n}\right)\vee u\left({x}_{j}^{m}\right)\le u\left({x}_{i}^{n}\right)\right)\hfill \\ \end{gathered}$$

To summarise the above definitions, there is taste-based discrimination if the knowledge of who the providers of the alternatives’ characteristics are: (a) leads to a preference of one alternative over another even though they have the same characteristics; and/or (b) changes preferences compared to a situation where providers are unknown. This also implies that if a decision-maker has the following preference orderings, we cannot label the second one as a case of taste-based discrimination even if she might have a taste for provider 1:

$${x}_{1},{x}_{2}\in \mathcal{X}:u\left({x}_{1}\right)>u\left({x}_{2}\right)$$
$$\wedge {x}_{1}^{1},{x}_{2}^{2}\in X:u\left({x}_{1}^{1}\right)>u\left({x}_{2}^{2}\right)$$

The reason for this is that otherwise one could always argue that such preferences involve taste-based discrimination even though this is not empirically observable, since the decision-maker also prefers \({x}_{1}\) to \({x}_{2}\) in a situation where she does not know the providers’ identities. Taste-based discrimination would only become visible, and therefore apply, if for example there were a third alternative \({x}_{2}^{1}\) in choice set \(X\) which the decision-maker prefers to \({x}_{2}^{2}\).

2.2.1 Are There Different Shades of Taste-Based Discrimination?

If we look at the definitions of taste-based discrimination and of no taste-based discrimination in situations where alternatives differ in both characteristics and provider, we make the following discovery: There are possible preference orderings that fall between our definitions. For example, our preference ordering regarding two alternatives with unspecified providers is \({x}_{1},{x}_{2}\in \mathcal{X}:u({x}_{1})=u({x}_{2})\). Let’s say the two alternatives are again Mars \(({x}_{1})\) and Snickers \(({x}_{2})\). These goods are offered by two providers. On the one hand, there is no taste-based discrimination if we are also indifferent between the providers of the goods:

$$\begin{gathered} x_{1}^{1} ,x_{2}^{1} ,x_{1}^{2} ,x_{2}^{2} \in X:u\left( {x_{1}^{1} } \right) = u\left( {x_{2}^{1} } \right) \wedge u\left( {x_{1}^{2} } \right) = u\left( {x_{2}^{2} } \right) \wedge u\left( {x_{1}^{1} } \right) = u\left( {x_{2}^{2} } \right) \wedge u\left( {x_{1}^{2} } \right) \hfill \\ = u\left( {x_{2}^{1} } \right) \wedge u\left( {x_{1}^{1} } \right) = u\left( {x_{1}^{2} } \right) \wedge u\left( {x_{2}^{1} } \right) = u\left( {x_{2}^{2} } \right) \hfill \\ \end{gathered}$$

On the other hand, having the same circumstances, there is taste-based discrimination if we prefer provider 1 to provider 2 (or vice versa):

$$\begin{gathered} x_{1}^{1} ,x_{2}^{1} ,x_{1}^{2} ,x_{2}^{2} \in X:u\left( {x_{1}^{1} } \right) = u\left( {x_{2}^{1} } \right) \wedge u\left( {x_{1}^{2} } \right) = u\left( {x_{2}^{2} } \right) \wedge u\left( {x_{1}^{1} } \right)\text{ > }\,u\left( {x_{2}^{2} } \right) \hfill \\ \wedge u\left( {x_{1}^{2} } \right)\text{ < }\,u\left( {x_{2}^{1} } \right) \wedge \,u\left( {x_{1}^{1} } \right)\text{ > }\,u\left( {x_{1}^{2} } \right) \wedge u\left( {x_{2}^{1} } \right)\text{ > }\,u(x_{2}^{2} ) \hfill \\ \end{gathered}$$

Now, in the above preference ordering, we always prefer the goods offered by provider 1 to those offered by provider 2. But what if this is only sometimes the case, as in the following preference ordering?

$$\begin{gathered} x_{1}^{1} ,x_{2}^{1} ,x_{1}^{2} ,x_{2}^{2} \in X:u\left( {x_{1}^{1} } \right) = u\left( {x_{2}^{1} } \right) \wedge \,u\left( {x_{1}^{2} } \right) = u\left( {x_{2}^{2} } \right) \wedge \,u\left( {x_{1}^{1} } \right) = u\left( {x_{2}^{2} } \right) \wedge \,u\left( {x_{1}^{2} } \right) \hfill \\ = u\left( {x_{2}^{1} } \right) \wedge \,u\left( {x_{1}^{1} } \right) = u\left( {x_{1}^{2} } \right) \wedge u\left( {x_{2}^{1} } \right)\text{ > }\,u\left( {x_{2}^{2} } \right) \hfill \\ \end{gathered}$$

Here, we are always indifferent between providers except when both offer Snickers. In this case, we prefer to have Snickers from provider 1 and not from provider 2. Obviously, within such preferences there seems to be less taste-based discrimination than within the ones where provider 1 is always preferred. So, are these two different types of taste-based discrimination? Or might the last preference ordering not even fall under taste-based discrimination?

We start with the second question. As a reminder, we said that there is taste-based discrimination in a situation where alternatives differ regarding their characteristics and the decision-maker is indifferent between these characteristics if:

$${\forall x}_{i},{x}_{j}\in \mathcal{X}:u\left({x}_{i}\right)=u({x}_{j})$$
$$\wedge \exists {x}_{i}^{m},{x}_{i}^{n},{x}_{j}^{n}\in \mathrm{X}:u\left({x}_{i}^{m}\right)>u({x}_{i}^{n})\vee u\left({x}_{i}^{m}\right)>u\left({x}_{j}^{n}\right)$$

Therefore, even if an individual is indifferent between all alternatives except one, she still displays taste-based discrimination. This is because in this one binary relation \(\left({x}_{2}^{1}\succ {x}_{2}^{2}\right)\), the only reason why she could prefer the first to the second alternative is the different identity of the alternatives’ providers.

Let us continue with the question of multiple types of taste-based discrimination. For example, it could be said that there is weak and strong taste-based discrimination. A preference ordering that strictly prefers one provider to the other represents strong taste-based discrimination. In contrast, a preference ordering that only sometimes prefers one provider to the other and otherwise is indifferent between the two (or even prefers the other provider) represents weak taste-based discrimination. This idea is reasonable, yet it applies to a different context. We have to differentiate between two situations. The first one is as described above: We are indifferent between the characteristics of our alternatives but not between their providers. If this is the case, we do not differentiate between different types of taste-based discrimination for a simple reason. If one provider is strictly preferred to the other in some comparisons but not in all of them, the preference ordering becomes intransitive. For example, above we had a preference ordering where we were always indifferent except in one binary relation \(\left({x}_{2}^{1}\succ {x}_{2}^{2}\right)\). However, because of transitivity, we should actually be indifferent between \({x}_{2}^{1}\) and \({x}_{2}^{2}\), since we are also indifferent between \({x}_{1}^{1}\) and \({x}_{2}^{1}\) as well as between \({x}_{1}^{1}\) and \({x}_{2}^{2}\). And since we assume transitivity, no shades of taste-based discrimination are possible in such a situation.
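The contradiction can be spelled out step by step. The following lines merely restate the ordering with the single strict relation \(\left({x}_{2}^{1}\succ {x}_{2}^{2}\right)\) in utility terms and use nothing beyond transitivity:

$$u\left({x}_{2}^{1}\right)=u\left({x}_{1}^{1}\right)\wedge u\left({x}_{1}^{1}\right)=u\left({x}_{2}^{2}\right)\Rightarrow u\left({x}_{2}^{1}\right)=u\left({x}_{2}^{2}\right)$$
$$u\left({x}_{2}^{1}\right)=u\left({x}_{2}^{2}\right)\,\text{contradicts}\,u\left({x}_{2}^{1}\right)>u\left({x}_{2}^{2}\right)$$

Hence, no utility function \(u\) can represent this ordering, which is why it is ruled out once transitivity is assumed.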

The second situation involves a preference ordering where some characteristics and providers are preferred to others. For example, let’s say that an employer is looking for a new worker. Her choice set consists of two alternatives: \({x}_{1}\) = “highly productive workforce”; \({x}_{2}\) = “mediocrely productive workforce”. Moreover, each of the two alternatives is provided by a white (\({x}_{1}^{1},{x}_{2}^{1}\)) and a black person (\({x}_{1}^{2},{x}_{2}^{2}\)). Without knowing the identity of those who provide these characteristics, the employer of course prefers \({x}_{1}\) to \({x}_{2}\). However, if she also knows the providers’ identities, different types of taste-based discrimination are possible. We start with weak taste-based discrimination. It implies that if the decision-maker is indifferent between the characteristics of two alternatives, she chooses the alternative of the preferred provider. Otherwise, she chooses the alternative whose characteristics she prefers. Regarding the example, an employer who has a taste for white people prefers a white to a black worker if the white worker is more productive or if they are equally productive, but prefers a black to a white worker if the black worker is more productive than the white one. In formal terms:

$${x}_{1},{x}_{2}\in \mathcal{X}:u\left({x}_{1}\right)>u({x}_{2})$$
$$\begin{gathered} \wedge x_{1}^{1} ,x_{2}^{1} ,x_{1}^{2} ,x_{2}^{2} \in X:u\left( {x_{1}^{1} } \right)\text{ > }\,u\left( {x_{2}^{1} } \right) \wedge u\left( {x_{1}^{2} } \right)\text{ > }\,u\left( {x_{2}^{2} } \right) \wedge u\left( {x_{1}^{1} } \right)\text{ > }\,u\left( {x_{2}^{2} } \right) \hfill \\ \wedge\,u\left( {x_{1}^{1} } \right)\text{ > }\,u\left( {x_{1}^{2} } \right) \wedge u\left( {x_{2}^{1} } \right)\text{ > }\,u\left( {x_{2}^{2} } \right) \wedge u\left( {x_{2}^{1} } \right)\text{ < }\,u\left( {x_{1}^{2} } \right) \hfill \\ \end{gathered}$$

This differs from strong taste-based discrimination. Here, the decision-maker does not prefer an alternative whose characteristics are comparatively more favourable than those of another alternative if she prefers the provider of the latter. Regarding the example, an employer does not prefer a black worker who is highly productive to a white worker who is mediocrely productive, due to a preference for white skin colour. Formally:

$${x}_{1},{x}_{2}\in \mathcal{X}:u\left({x}_{1}\right)>u({x}_{2})$$
$$\begin{gathered} \wedge x_{1}^{1} ,x_{2}^{1} ,x_{1}^{2} ,x_{2}^{2} \in X:u\left( {x_{1}^{1} } \right)\text{ > }\,u\left( {x_{2}^{1} } \right) \wedge u\left( {x_{1}^{2} } \right)\text{ > }\,u\left( {x_{2}^{2} } \right) \wedge u\left( {x_{1}^{1} } \right)\text{ > }\,u\left( {x_{2}^{2} } \right) \hfill \\ \wedge\,u\left( {x_{1}^{1} } \right)\text{ > }\,u\left( {x_{1}^{2} } \right) \wedge u\left( {x_{2}^{1} } \right)\text{ > }\,u\left( {x_{2}^{2} } \right) \wedge u\left( {x_{2}^{1} } \right) \ge u\left( {x_{1}^{2} } \right) \hfill \\ \end{gathered}$$

This reveals the difference between weak and strong taste-based discrimination. Only in the case of strong taste-based discrimination is the decision-maker willing to bear costs in order to choose the alternative whose characteristics are provided by the preferred person.Footnote 17 In our example, the cost is lower productivity.
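To make the distinction concrete, the following sketch assigns purely illustrative utility numbers to the employer example. The values are assumptions chosen for exposition, and only their order matters. Under weak taste-based discrimination, productivity comes first and the provider only decides between workers who are equally productive:

$$u\left({x}_{1}^{1}\right)=4,\quad u\left({x}_{1}^{2}\right)=3,\quad u\left({x}_{2}^{1}\right)=2,\quad u\left({x}_{2}^{2}\right)=1$$

Under strong taste-based discrimination, the mediocrely productive white worker is ranked at least as high as the highly productive black worker:

$$u\left({x}_{1}^{1}\right)=4,\quad u\left({x}_{2}^{1}\right)=3,\quad u\left({x}_{1}^{2}\right)=2,\quad u\left({x}_{2}^{2}\right)=1$$

Both assignments satisfy the five relations that the two cases share. They differ only in the comparison of \({x}_{2}^{1}\) and \({x}_{1}^{2}\): in the first assignment the employer hires the more productive black worker, whereas in the second she forgoes the productivity difference in favour of the preferred provider.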

2.2.2 Tastes for Groups

So far, we have always analysed choice sets with either two specific providers (e.g. \(X=\{{x}_{1}^{1},{x}_{2}^{1},{x}_{1}^{2},{x}_{2}^{2}\}\)) or with multiple providers of which we considered two possible ones (e.g. \(X=\{{x}_{i}^{m},{x}_{i}^{n},{x}_{j}^{m},{x}_{j}^{n}\}\)). Now, we investigate a choice set \(X\) that consists of multiple alternatives which always have the same characteristics \(i\) (\(I=\{1\}\)) but four different providers of these characteristics (\(M=\{1,2,3,4\}\)). So, \(X=\{{x}_{1}^{1},{x}_{1}^{2},{x}_{1}^{3},{x}_{1}^{4}\}\). Let us assume that an individual has the following preference ordering regarding this choice set \(X\):

$${x}_{1}^{1},{x}_{1}^{2},{x}_{1}^{3},{x}_{1}^{4}\in X:u\left({x}_{1}^{1}\right)=u\left({x}_{1}^{2}\right)\wedge u\left({x}_{1}^{3}\right)=u\left({x}_{1}^{4}\right)\wedge u\left({x}_{1}^{1}\right)>u\left({x}_{1}^{3}\right)$$

This implies that the decision-maker is indifferent between providers 1 and 2 and also between providers 3 and 4, yet prefers providers 1 and 2 to providers 3 and 4. Therefore, we can categorise the four providers into two groups. Group 1 consists of providers 1 and 2, whereas group 2 consists of providers 3 and 4. Within groups, the individual is indifferent between providers. However, between groups, she prefers group 1 to group 2. For example, you do not care whether you buy a Mars from Jack or John, and you are also indifferent whether you get it from Lisa or Lena. Nevertheless, you prefer male sellers to female sellers and thereby Jack and John to Lisa and Lena.

Following this argument, we can divide \(M\), which as a reminder is the set of all possible providers that offer the alternatives’ characteristics, into at least two subsets. We do this as follows: \(\Psi\) is the power set of \(M\) without the null set, which is thus no element of \(\Psi\). Next, \(A\) is a subset of \(\Psi\) with the requirement that the elements of \(A\) are pairwise disjoint and that their union equals \(M\). This requirement is necessary because each provider should be in precisely one group. So, \(A\) defines which groups are salient in the respective decision situation and which provider belongs to which group.Footnote 18 Finally, \({v}_{a}\) and \({w}_{a}\) are two distinct providers that belong to the subset \({\mathcal{M}}_{a}\), and \({v}_{b}\) and \({w}_{b}\) are two distinct providers that belong to the subset \({\mathcal{M}}_{b}\).

$$\Psi ={2}^{M}\setminus \{\varnothing \}=\{\dots ,\mathcal{C},\mathcal{D},\mathcal{E},\dots \}$$
$$A\subset \Psi$$
$${\mathcal{M}}_{a},{\mathcal{M}}_{b}\in A$$
$${\mathcal{M}}_{a}\cap {\mathcal{M}}_{b}=\mathrm{\varnothing }$$
$$\bigcup _{{\mathcal{M}}_{a}\in A}{\mathcal{M}}_{a}=M$$
$${v}_{a},{w}_{a}\in {\mathcal{M}}_{a};{v}_{b},{w}_{b}\in {\mathcal{M}}_{b}$$
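As a small illustration, consider the four sellers from the example above and label them, purely for this sketch, 1 = Jack, 2 = John, 3 = Lisa, and 4 = Lena. With \(M=\{1,2,3,4\}\), \(\Psi\) has \({2}^{4}-1=15\) elements, and the grouping by gender corresponds to:

$$A=\left\{{\mathcal{M}}_{1},{\mathcal{M}}_{2}\right\},\quad {\mathcal{M}}_{1}=\{1,2\},\quad {\mathcal{M}}_{2}=\{3,4\}$$
$${\mathcal{M}}_{1}\cap {\mathcal{M}}_{2}=\varnothing ,\quad {\mathcal{M}}_{1}\cup {\mathcal{M}}_{2}=\{1,2,3,4\}=M$$

By contrast, \(\{\{1,2\},\{2,3,4\}\}\) would not be admissible because provider 2 would belong to two groups, and \(\{\{1,2\},\{3\}\}\) would not be admissible because provider 4 would belong to no group.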

Applying this notation, there is taste-based group discrimination in a situation where providers offer the same characteristics if:

$$\begin{gathered} \forall x_{i}^{{v_{a} }} ,x_{i}^{{w_{a} }} ,x_{i}^{{v_{b} }} ,x_{i}^{{w_{b} }} \in X:u\left( {x_{i}^{{v_{a} }} } \right) = u\left( {x_{i}^{{w_{a} }} } \right) \wedge u\left( {x_{i}^{{v_{b} }} } \right) = u\left( {x_{i}^{{w_{b} }} } \right) \hfill \end{gathered}$$
$$\begin{gathered} \wedge \exists x_{i}^{{v_{a} }} ,x_{i}^{{v_{b} }} \in X:u\left( {x_{i}^{{v_{a} }} } \right)\text{ > }\,u\left( {x_{i}^{{v_{b} }} } \right) \hfill \end{gathered}$$

In this dissertation, we assume that all members within a group are always treated equally and therefore that there is indifference between providers who are members of the same group. As a result, we can simplify the above formulation because we do not have to regard the individuals within a group but can consider the groups as a whole:

$$\exists {x}_{i}^{{\mathcal{M}}_{a}},{x}_{i}^{{\mathcal{M}}_{b}}\in X:u\left({x}_{i}^{{\mathcal{M}}_{a}}\right)>u\left({x}_{i}^{{\mathcal{M}}_{b}}\right)$$

We see that this last formulation is very similar to the one of taste-based discrimination in a situation where providers offer the same characteristics:

$$\exists {x}_{i}^{m},{x}_{i}^{n}\in \mathrm{X}:u({x}_{i}^{m})>u({x}_{i}^{n})$$

The sole difference is that while in the case of taste-based discrimination we talk about individual providers \(m\) and \(n\), in the case of taste-based group discrimination we talk about group providers \({\mathcal{M}}_{a}\) and \({\mathcal{M}}_{b}\). The latter sum up all individuals who belong to a possible group \({\mathcal{M}}_{a}\) or \({\mathcal{M}}_{b}\), respectively. As a consequence, all definitions of taste-based discrimination can also be applied to a context of taste-based group discrimination. One simply has to replace \(m\) with \({\mathcal{M}}_{a}\) and \(n\) with \({\mathcal{M}}_{b}\). From now on, we are mainly interested in the group membership of providers and therefore no longer use \(m\) and \(n\) but \({\mathcal{M}}_{a}\) and \({\mathcal{M}}_{b}\). Additionally, we will no longer explicitly refer to taste-based discrimination that involves groups as taste-based group discrimination but simply call it taste-based discrimination as well.

2.3 Social Discrimination Under Uncertainty

So far, each alternative \({x}_{i}\) has led to a certain outcome and thereby to a certain utility. This is no longer the case in decision-making under uncertainty, where an alternative can lead to various outcomes. Additionally, the probabilities of these potential outcomes are subjective, meaning that the decision-maker must assess them with some degree of vagueness (Knight, 1921).Footnote 19 How can we explain a decision-maker’s behaviour if her choice is subject to uncertainty? According to subjective expected utility theory, a decision-maker’s behaviour can be described as if she tries to maximise her expected utility with regard to some subjective probabilities.

Savage (1954) has provided the most well-known justification for subjective expected utility theory. Its strength is that it works without the necessity of any objective probabilities. But as Kreps (1988) writes: “[T]his strength comes at a price—obtaining the representation is … quite a hard task.” (p. 38) So, we have to ask whether the impossibility of objective probabilities per se is necessary so as to define social discrimination under uncertainty in this dissertation. The answer is no. Thus, we assume that there are objective randomising devices such as a perfect dice or a fair coin and due to that we can use a middle of the road formulation for subjective expected utility theory: the Anscombe-Aumann representation theorem.

Anscombe and Aumann (1963) use a setup similar to that of Savage (1954). There are four ingredients: (1) a finite set of states of the world, denoted by \(S\), where \({s}_{\mathfrak{i}}\in S,\mathfrak{i}=1,\dots ,n\); (2) an arbitrary set of prizes or consequences, denoted by \(Z\); (3) the set of all simple probability distributions on \(Z\), denoted by \(P\); and (4) the set of all functions from \(S\) to \(P\), denoted by \(H\), whose elements \(h\) are called acts. So, \(h({s}_{\mathfrak{i}})\), which we use interchangeably with \({h}_{\mathfrak{i}}\), \({h}_{\mathfrak{i}}\in P\), is the probability distribution on \(Z\) if the decision-maker chooses act \(h\) and \({s}_{\mathfrak{i}}\) occurs. Accordingly, if \(\mathfrak{i}=1,\dots ,n\), then \(h=({h}_{1},\dots ,{h}_{n})\).

Of course, the question of interest to a decision-maker is whether an act \(h\) or \(g\) (\(h,g\in H\)) provides a larger expected utility. This ultimately depends on how likely each of the states of the world is, which in turn is subjective. In order to solve this problem, we need seven assumptions. The first three are the same ones that we already defined at the beginning of chapter 2: reflexivity, completeness, and transitivity. We simply have to apply them to the elements of \(H\).Footnote 20 The other four are called continuity, independence, nontriviality, and monotonicity.Footnote 21 Continuity indicates that there is a tipping point (and no jump) between being worse than and better than a given middle act.

$$\begin{gathered} {\mathbf{Assumption}}\,{\mathbf{4}}\,\left( {{\mathbf{continuity}}} \right):{\text{For}}\,{\text{every}}\,h,g,l \in H,{\text{if}}\,h \succ g \succ l,{\text{there}}\,{\text{exist}}\,\alpha ,\beta \hfill \\ \in \left( {{\text{0}},{\text{1}}} \right)\,{\text{such}}\,{\text{that}}\,\alpha h + \left( {1 - \alpha } \right)l \succ g \succ \beta h + \left( {1 - \beta } \right)l. \hfill \\ \end{gathered}$$

Independence states that a preference ordering holds independently of the possibility of another act:

$$\begin{gathered} {\mathbf{Assumption}}\,{\mathbf{5}}\,\left( {{\mathbf{independence}}} \right):{\text{For}}\,{\text{every}}\,{\mkern 1mu} h,g,l \in H\;{\text{and}}\,{\mkern 1mu} {\text{every}}{\mkern 1mu} \,\alpha \in ({\text{0}},{\text{1}}),\hfill \\ h \succsim g~{\text{iff}}\,{\mkern 1mu} \alpha h + \left( {1 - \alpha } \right)l \succsim \alpha g + \left( {1 - \alpha } \right)l. \hfill \\ \end{gathered}$$

Nontriviality means that there is at least one act \(h\) in \(H\) that is preferred to some other act \(g\).Footnote 22

$${\mathbf{Assumption}}\,{\mathbf{6}}\,\left( {{\mathbf{nontriviality}}} \right):{\text{There}}\,{\text{exist}} \ h,g \in H\,{\text{such}}\,{\text{that}}\,h \succ g.$$

Monotonicity requires that “if two acts differ only on a single state, then the preference between these two acts is given by the preference between the lotteries that are assigned to that state” (Schneider & Schonger, 2017, p. 1), which implies state-independence of preferences.

$$\begin{gathered} {\mathbf{Assumption}}\,{\mathbf{7}}\,\left( {{\mathbf{monotonicity}}} \right):{\text{For}}\,{\text{every}}\,h,g \in H,h\left( {s_{{\mathfrak{i}}} } \right)\succsim g\left( {s_{{\mathfrak{i}}} } \right) \ {\text{for}}\,{\text{all}}\,s_{{\mathfrak{i}}} \in \hfill\\ S \,{\text{implies}}\,h \succsim g. \hfill \\ \end{gathered}$$

If these seven assumptions are fulfilled, the Anscombe-Aumann representation theorem applies. Note that the subjective probability of a state of the world \({s}_{\mathfrak{i}}\) is represented by \({\mathfrak{p}}_{\mathfrak{i}}\), \({\mathfrak{p}}_{\mathfrak{i}}\in \mathcal{P}\), where \(\mathcal{P}\) is the set of all possible subjective probabilities. Moreover, it is important to note that \({\mathfrak{p}}_{\mathfrak{i}}\) is not allowed to depend on the chosen act and therefore is the same for all acts in \(H\) (Kreps, 1988).

$$h\succ g\mathrm{\,}\mathrm{i}\mathrm{f}\mathrm{f}\mathrm{\,}\sum _{\mathfrak{i}=1}^{n}{\mathfrak{p}}_{\mathfrak{i}}\left[\sum _{z}u\left(z\right){h}_{\mathfrak{i}}\left(z\right)\right]>\sum _{\mathfrak{i}=1}^{n}{\mathfrak{p}}_{\mathfrak{i}}\left[\sum _{z}u\left(z\right){g}_{\mathfrak{i}}\left(z\right)\right]$$
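A small numerical sketch may help to read this representation. All numbers below are hypothetical and chosen only for illustration: there are two states of the world (\(n=2\)), two prizes \({z}_{1}\) and \({z}_{2}\) with \(u({z}_{1})=1\) and \(u({z}_{2})=0\), and two acts \(h\) and \(g\), where the first entry of each pair is the probability of \({z}_{1}\):

$${h}_{1}=\left(0.8,0.2\right),\quad {h}_{2}=\left(0.4,0.6\right),\quad {g}_{1}={g}_{2}=\left(0.5,0.5\right)$$

With subjective probabilities \({\mathfrak{p}}_{1}={\mathfrak{p}}_{2}=0.5\), the two sides of the representation are:

$$0.5\cdot 0.8+0.5\cdot 0.4=0.6>0.5=0.5\cdot 0.5+0.5\cdot 0.5$$

so that \(h\succ g\). With \({\mathfrak{p}}_{1}=0.1\) and \({\mathfrak{p}}_{2}=0.9\), the same acts yield:

$$0.1\cdot 0.8+0.9\cdot 0.4=0.44<0.5=0.1\cdot 0.5+0.9\cdot 0.5$$

so that \(g\succ h\). The ranking of the acts thus depends on the decision-maker’s subjective probabilities, although the objective lotteries \({h}_{\mathfrak{i}}\) and \({g}_{\mathfrak{i}}\) are unchanged.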

This representation can be further simplified if we reduce \(H\) to a specific subset. Remember that one major difference between Anscombe and Aumann (1963) and Savage (1954) is that, in case of the former, acts do not directly lead to consequences but to simple probability distributions on consequences. This is why such acts are denoted by \(h\in H\) and not \(f\in F\) as in case of Savage. However, \(F\) can actually be identified with a particular subset of \(H\), namely the subset of those acts whose second lottery (the one after the subjective lottery) is degenerate (Kreps, 1988). We abuse the notation a bit and say that \(F\subset H\) and thus \(f\in F\) and \(f\in H\). Due to that we can simplify the Anscombe-Aumann representation theorem so long as the respective acts are elements of \(F\). Note that \({f}^{\prime}\in F\) and \({f}^{\prime}\ne f\).

$$f\succ {f}^{\prime}\,\mathrm{iff}\,\sum _{\mathfrak{i}=1}^{n}{\mathfrak{p}}_{\mathfrak{i}}u\left(f\left({s}_{\mathfrak{i}}\right)\right)>\sum _{\mathfrak{i}=1}^{n}{\mathfrak{p}}_{\mathfrak{i}}u\left({f}^{\prime}\left({s}_{\mathfrak{i}}\right)\right)$$

In the following, we will use this formulation in order to analyse discrimination under uncertainty. Therefore, the acts that we consider are always elements of \(F\). Moreover, we will no longer call the elements of \(F\) acts but simply alternatives whose outcomes are uncertain. \({f}_{i}\) is one of the possible alternatives from such a choice set \(F\). Lastly, since “states of the world” is a rather lengthy expression, we from now on simply call states of the world scenarios.

Now that we have a subjective expected utility theory we get to the next question. What defines these subjective probabilities? To start with, they are defined by Kolmogorov’s (1933) axiomatisation which can be seen as the three fundamental assumptions of probability theory. Let’s use \({\mathfrak{p}}_{\mathfrak{i}}\) interchangeably with \(\mathfrak{p}\left({s}_{\mathfrak{i}}\right)\), where \({s}_{\mathfrak{i}}\in S,\mathfrak{i}=1,\dots,n\):Footnote 23

  1. \(\left(\text{Non-negativity}\right):\mathfrak{p}\left({s}_{\mathfrak{i}}\right)\ge 0\,\text{for all}\,{s}_{\mathfrak{i}}\in S.\)

  2. \(\left(\text{Normalisation}\right):\mathfrak{p}\left(S\right)=1.\)

  3. \(\left(\text{Finite additivity}\right):\mathfrak{p}\left({s}_{\mathfrak{i}}\cup {s}_{\mathfrak{j}}\right)=\mathfrak{p}\left({s}_{\mathfrak{i}}\right)+\mathfrak{p}\left({s}_{\mathfrak{j}}\right)\,\text{for all}\,{s}_{\mathfrak{i}},{s}_{\mathfrak{j}}\in S\,\text{such that}\,{s}_{\mathfrak{i}}\cap {s}_{\mathfrak{j}}=\emptyset .\)

Yet, these three properties only set the frame for subjective probabilities. So, the question of what ultimately determines them is still unanswered. In this dissertation, we assume that a scenario’s subjective probability is defined by our beliefs. \(\mathcal{B}\) is the set of all beliefs, whereby \(b\) is one possible belief. Importantly, \(A\), which we introduced in Section 2.2.2 and which defines how we divide individuals into groups, can also be seen as a belief. So, we say that \(A\) is one of the elements of \(\mathcal{B}\). Next, \(\mathfrak{B}\) is the power set of \(\mathcal{B}\) with the restriction that all elements of \(\mathfrak{B}\) have to include \(A\). \(\mathfrak{b}\) is a possible element of \(\mathfrak{B}\).

$$b\in \mathcal{B}$$
$$\mathfrak{B}={2}^{\mathcal{B}};\mathfrak{b}\in \mathfrak{B}$$
$$\forall \mathfrak{b}\in \mathfrak{B}:A\in \mathfrak{b}$$

Now, thanks to this setup, there has to be an element in \(\mathfrak{B}\) that contains all beliefs that a decision-maker holds. Since that could be any element in \(\mathfrak{B}\), the decision-maker’s beliefs are simply denoted by \(\mathfrak{b}\). Finally, we need a set of all functions from \(\mathfrak{B}\) to \(\mathcal{P}\), denoted by \(\mathcal{Q}\), where \({\mathfrak{q}}_{\mathfrak{i}}\) is a possible element of \(\mathcal{Q}\). The expected utility of an alternative \({f}_{i}\) whose outcome is subject to uncertainty is therefore given by:

$$\sum _{\mathfrak{i}=1}^{n}{\mathfrak{q}}_{\mathfrak{i}}(\mathfrak{b})u\left({f}_{i}\left({s}_{\mathfrak{i}}\right)\right)$$

In order to define whether there is taste-based discrimination in a decision that involves multiple providers and uncertainty, we first have to partition a decision-maker’s beliefs \(\mathfrak{b}\) into three categories. The first category contains all beliefs that are group unspecific. We denote this subset of beliefs as \({\beta }_{\theta }\). The second category includes all beliefs that are group specific except for belief \(A\). We denote this subset of beliefs as \({\beta }_{\mu }\). The third category only includes belief \(A\). We denote this subset of beliefs as \({\beta }_{\pi }\). Using this partitioning, we attain the following subjective expected utility of an alternative \({f}_{i}\) whose provider belongs to \({\mathcal{M}}_{a}\) and whose outcome is uncertain. Note that due to \({\beta }_{\mu }\) the probability \({\mathfrak{p}}_{\mathfrak{i}}\) now considers beliefs that are group specific and in so doing also beliefs about \({\mathcal{M}}_{a}\).Footnote 24 Since the subset \({\beta }_{\pi }\) always exclusively contains the element \(A\), we will directly use \(A\) in the formulation. Finally, it is important to note that \({\mathfrak{p}}_{\mathfrak{i}}\) is still the same for all alternatives \({f}_{i}\) in a choice set \(F\). So, this should not be confused with the idea that a chosen alternative \({f}_{i}\) affects \({\mathfrak{p}}_{\mathfrak{i}}\), which we said is not possible.

$$\sum _{\mathfrak{i}=1}^{n}{\mathfrak{q}}_{\mathfrak{i}}({\beta }_{\theta },{\beta }_{\mu },A)u\left({f}_{i}^{{\mathcal{M}}_{a}}\left({s}_{\mathfrak{i}}\right)\right)$$

Thanks to this partitioning, we can isolate the influence of group specific beliefs \({\beta }_{\mu }\) on probabilities. In a next step, we exclude it from the probability function (\({\mathfrak{q}}_{\mathfrak{i}}\left({\beta }_{\theta },{\beta }_{\mu },A\right)\to {\mathfrak{q}}_{\mathfrak{i}}({\beta }_{\theta },A)\)) so as to assess whether there is taste-based discrimination. Here, it also becomes clear why we had to separate \(A\) from all other group specific beliefs: otherwise, if we excluded \({\beta }_{\mu }\), we could no longer fall back on our categorisation of individuals into groups. As a consequence, there would be no groups at all. Yet, we do want to keep the group categorisation and simply drop all further beliefs that are linked to these groups. Following these deliberations, there is taste-based discrimination in a situation where providers offer the same characteristics if:

$$\exists {f}_{i}^{{\mathcal{M}}_{a}},{f}_{i}^{{\mathcal{M}}_{b}}\in F:\sum _{\mathfrak{i}=1}^{n}{\mathfrak{q}}_{\mathfrak{i}}({\beta }_{\theta },A)u\left({f}_{i}^{{\mathcal{M}}_{a}}\left({s}_{\mathfrak{i}}\right)\right)>\sum _{\mathfrak{i}=1}^{n}{\mathfrak{q}}_{\mathfrak{i}}\left({\beta }_{\theta },A\right)u\left({f}_{i}^{{\mathcal{M}}_{b}}\left({s}_{\mathfrak{i}}\right)\right)$$

Accordingly, there is non-discrimination regarding providers’ group membership in a situation where providers offer the same characteristics if:

$$\forall {f}_{i}^{{\mathcal{M}}_{a}},{f}_{i}^{{\mathcal{M}}_{b}}\in F:\sum _{\mathfrak{i}=1}^{n}{\mathfrak{q}}_{\mathfrak{i}}\left({\beta }_{\theta },A\right)u\left({f}_{i}^{{\mathcal{M}}_{a}}\left({s}_{\mathfrak{i}}\right)\right)=\sum _{\mathfrak{i}=1}^{n}{\mathfrak{q}}_{\mathfrak{i}}\left({\beta }_{\theta },A\right)u\left({f}_{i}^{{\mathcal{M}}_{b}}\left({s}_{\mathfrak{i}}\right)\right)$$

We continue with the influence of providers’ group membership on subjective probabilities. The idea behind this is that the group membership of providers can serve as a proxy for how probable scenarios are. For example, let’s say you have broken your leg. There are two treatments: \({f}_{1}\) = “operation and cast”; \({f}_{2}\) = “only cast”. This leads to three scenarios: \({s}_{1}\) = “treatment 1 is better than treatment 2”; \({s}_{2}\) = “treatment 2 is better than treatment 1”; and \({s}_{3}\) = “both treatments are equally good”. Let’s say that without further information you assume the three scenarios to be equally likely. Now, you are told that the two treatments are provided by different persons. The only information you have about them is their professional group membership. While \({f}_{1}\) is provided by a doctor, the provider of \({f}_{2}\) is a lawyer (\(A=\{{\mathcal{M}}_{doctor},{\mathcal{M}}_{lawyer}\}\)). In all likelihood, you have group specific beliefs about doctors and lawyers that influence your subjective probabilities of the three scenarios: \({s}_{1}\) becomes more probable than the other two. Yet, as soon as you can no longer consult your group specific beliefs, the scenarios’ subjective probabilities are again the same as when the group membership of providers was unknown. Therefore, we can say that group specific beliefs are relevant if the consideration of both group specific and unspecific beliefs leads to different subjective probabilities than the consideration of only group unspecific beliefs.
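The broken-leg example can be made concrete with hypothetical numbers. The utilities and the updated probabilities below are assumptions made only for this sketch: the better treatment yields a utility of 1, the worse one a utility of 0, and two equally good treatments a utility of 0.5 each. Without group specific beliefs, the three scenarios are equally likely, so both treatments have the same expected utility:

$$\left({\mathfrak{q}}_{1},{\mathfrak{q}}_{2},{\mathfrak{q}}_{3}\right)\left({\beta }_{\theta },A\right)=\left(\tfrac{1}{3},\tfrac{1}{3},\tfrac{1}{3}\right):\quad \tfrac{1}{3}\cdot 1+\tfrac{1}{3}\cdot 0+\tfrac{1}{3}\cdot 0.5=0.5=\tfrac{1}{3}\cdot 0+\tfrac{1}{3}\cdot 1+\tfrac{1}{3}\cdot 0.5$$

Suppose that the group specific beliefs about doctors and lawyers shift the probabilities towards \({s}_{1}\):

$$\left({\mathfrak{q}}_{1},{\mathfrak{q}}_{2},{\mathfrak{q}}_{3}\right)\left({\beta }_{\theta },{\beta }_{\mu },A\right)=\left(0.6,0.1,0.3\right):\quad 0.6\cdot 1+0.1\cdot 0+0.3\cdot 0.5=0.75>0.25=0.6\cdot 0+0.1\cdot 1+0.3\cdot 0.5$$

The subjective probabilities differ between the two lines, so the group specific beliefs are relevant in the sense just described, and in this sketch they also change which treatment has the higher expected utility.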

From this point we can now define a phenomenon called statistical discrimination. The expression stems from Arrow (1972a, 1972b, 1973) and Phelps (1972), who proposed an explanation for discrimination in the labour market that differed from Becker’s (1971) idea of taste-based discrimination.Footnote 25 Their models suggest that an employer is imperfectly informed about some relevant characteristics (e.g. productivity) of her applicants and thus uses group statistics as proxies of these unobserved characteristics (Fang & Moro, 2011). This can lead to group inequalities in the labour market if employers (correctly) assume that on average members of some groups are more productive than those of others.Footnote 26

Applied to our setup, statistical discrimination implies that a decision-maker prefers an alternative \({f}_{i}^{{\mathcal{M}}_{a}}\) to an alternative \({f}_{i}^{{\mathcal{M}}_{b}}\) because of the influence that the providers’ group memberships have on the subjective probability of the alternatives’ scenarios. As a consequence, unlike in decision-making under certainty, in decision-making under uncertainty the characteristics of an alternative and the group membership of its provider can no longer always be separated. More precisely, they are not separable if there is statistical discrimination. In such a situation we mark the \(i\) of \({f}_{i}^{{\mathcal{M}}_{a}}\) with a little star (*), leading to \({f}_{{i}^{*}}^{{\mathcal{M}}_{a}}\), which indicates that \(i\) actually is \({i}^{{\mathcal{M}}_{a}}\) and thus no longer equivalent to the \(i\) of \({f}_{{i}^{*}}^{{\mathcal{M}}_{b}}\), which now is \({i}^{{\mathcal{M}}_{b}}\). So, regarding a choice set \(F\) where providers offer the “same” characteristics, there is pure statistical discrimination if:Footnote 27

$$\forall {f}_{i}^{{\mathcal{M}}_{a}},{f}_{i}^{{\mathcal{M}}_{b}}\in F:\sum _{\mathfrak{i}=1}^{n}{\mathfrak{q}}_{\mathfrak{i}}\left({\beta }_{\theta },A\right)u\left({f}_{i}^{{\mathcal{M}}_{a}}\left({s}_{\mathfrak{i}}\right)\right)=\sum _{\mathfrak{i}=1}^{n}{\mathfrak{q}}_{\mathfrak{i}}\left({\beta }_{\theta },A\right)u\left({f}_{i}^{{\mathcal{M}}_{b}}\left({s}_{\mathfrak{i}}\right)\right)$$
$$\wedge \exists {f}_{{i}^{\mathrm{*}}}^{{\mathcal{M}}_{a}},{f}_{{i}^{\mathrm{*}}}^{{\mathcal{M}}_{b}}\in F:\sum _{\mathfrak{i}=1}^{n}{\mathfrak{q}}_{\mathfrak{i}}\left({\beta }_{\theta },{\beta }_{\mu },A\right)u\left({f}_{{i}^{\mathrm{*}}}^{{\mathcal{M}}_{a}}\left({s}_{\mathfrak{i}}\right)\right)>\sum _{\mathfrak{i}=1}^{n}{\mathfrak{q}}_{\mathfrak{i}}({\beta }_{\theta },{\beta }_{\mu },A)u\left({f}_{{i}^{\mathrm{*}}}^{{\mathcal{M}}_{b}}\left({s}_{\mathfrak{i}}\right)\right)$$

Why is there no greater-than-or-equal sign in the last expression? Indeed, statistical discrimination does not necessarily have to change the alternatives’ expected utilities compared to a situation where probabilities are independent of group specific beliefs. However, if a decision that involves statistical discrimination leads to exactly the same result as one that does not, it is impossible to observe empirically whether there truly was statistical discrimination. It could then always be argued that an action involved statistical discrimination even though it was not observable. This poses a problem because it dilutes statistical discrimination as a concept of analysis. Thus, making a virtue of necessity, our definition of statistical discrimination requires that the use of group specific beliefs changes the decision-maker’s preferences and thereby her behaviour. This is the reason why there is a greater-than sign and not a greater-than-or-equal sign.
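
The two conditions of the definition can be checked numerically for a single pair of alternatives. The following Python sketch is only an illustration under stated assumptions: the function names, the utility values, and the probability vectors are hypothetical and not taken from this dissertation, and the choice set is reduced to one pair so that the quantifiers \(\forall\) and \(\exists\) collapse to a single comparison.

```python
from fractions import Fraction as Fr

def expected_utility(probs, utils):
    """Sum of q_i * u(f(s_i)) over all scenarios."""
    return sum(q * u for q, u in zip(probs, utils))

def is_pure_statistical(q_unspecific, q_specific, u_a, u_b):
    """Check the definition for one pair (f^Ma, f^Mb).

    Condition 1: equal expected utility when only group unspecific beliefs count.
    Condition 2: strictly higher expected utility for f^Ma once group specific
    beliefs are added (strict, so that the effect shows up in behaviour).
    """
    equal_before = expected_utility(q_unspecific, u_a) == expected_utility(q_unspecific, u_b)
    strict_after = expected_utility(q_specific, u_a) > expected_utility(q_specific, u_b)
    return equal_before and strict_after

# Hypothetical utilities of the prizes in scenarios s1, s2, s3.
u_a = [6, 0, 3]
u_b = [0, 6, 3]
uniform = [Fr(1, 3)] * 3                   # group unspecific beliefs only
shifted = [Fr(1, 2), Fr(1, 4), Fr(1, 4)]   # assumed effect of group specific beliefs

print(is_pure_statistical(uniform, shifted, u_a, u_b))  # True
print(is_pure_statistical(uniform, uniform, u_a, u_b))  # False: beliefs change nothing
```

The second call returns False precisely because of the strict inequality: if the group specific beliefs leave the expected utilities equal, the sketch, like the definition, does not count the case as statistical discrimination.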

Given the above definition of pure statistical discrimination, it is straightforward to state when there is neither taste-based nor statistical discrimination in a situation where providers offer the “same” characteristics:

$$\forall {f}_{i}^{{\mathcal{M}}_{a}},{f}_{i}^{{\mathcal{M}}_{b}}\in F:\sum _{\mathfrak{i}=1}^{n}{\mathfrak{q}}_{\mathfrak{i}}\left({\beta }_{\theta },A\right)u\left({f}_{i}^{{\mathcal{M}}_{a}}\left({s}_{\mathfrak{i}}\right)\right)=\sum _{\mathfrak{i}=1}^{n}{\mathfrak{q}}_{\mathfrak{i}}\left({\beta }_{\theta },A\right)u\left({f}_{i}^{{\mathcal{M}}_{b}}\left({s}_{\mathfrak{i}}\right)\right)$$
$$\wedge \forall {f}_{i}^{{\mathcal{M}}_{a}},{f}_{i}^{{\mathcal{M}}_{b}}\in F:\sum _{\mathfrak{i}=1}^{n}{\mathfrak{q}}_{\mathfrak{i}}\left({\beta }_{\theta },{\beta }_{\mu },A\right)u\left({f}_{i}^{{\mathcal{M}}_{a}}\left({s}_{\mathfrak{i}}\right)\right)=\sum _{\mathfrak{i}=1}^{n}{\mathfrak{q}}_{\mathfrak{i}}({\beta }_{\theta },{\beta }_{\mu },A)u\left({f}_{i}^{{\mathcal{M}}_{b}}\left({s}_{\mathfrak{i}}\right)\right)$$

Now, let’s go through the other combinations. We do so under the assumption that \(A=\{{\mathcal{M}}_{1},{\mathcal{M}}_{2}\}\), \(I=\{1\}\), and \(F=\left\{{f}_{1}^{{\mathcal{M}}_{1}},{f}_{1}^{{\mathcal{M}}_{2}}\right\}\). First, we examine a situation where there is both taste-based and statistical discrimination, yet the combination of the two seems to imply that there is no discrimination at all. In formal terms:

$${f}_{1}^{{\mathcal{M}}_{1}},{f}_{1}^{{\mathcal{M}}_{2}}\in F:\sum _{\mathfrak{i}=1}^{n}{\mathfrak{q}}_{\mathfrak{i}}\left({\beta }_{\theta },A\right)u\left({f}_{1}^{{\mathcal{M}}_{1}}\left({s}_{\mathfrak{i}}\right)\right)>\sum _{\mathfrak{i}=1}^{n}{\mathfrak{q}}_{\mathfrak{i}}({\beta }_{\theta },A)u\left({f}_{1}^{{\mathcal{M}}_{2}}\left({s}_{\mathfrak{i}}\right)\right)$$
$$\wedge {f}_{{1}^{\mathrm{*}}}^{{\mathcal{M}}_{1}},{f}_{{1}^{\mathrm{*}}}^{{\mathcal{M}}_{2}}\in F:\sum _{\mathfrak{i}=1}^{n}{\mathfrak{q}}_{\mathfrak{i}}\left({\beta }_{\theta },{\beta }_{\mu },A\right)u\left({f}_{{1}^{\mathrm{*}}}^{{\mathcal{M}}_{1}}\left({s}_{\mathfrak{i}}\right)\right)=\sum _{\mathfrak{i}=1}^{n}{\mathfrak{q}}_{\mathfrak{i}}({\beta }_{\theta },{\beta }_{\mu },A)u\left({f}_{{1}^{\mathrm{*}}}^{{\mathcal{M}}_{2}}\left({s}_{\mathfrak{i}}\right)\right)$$

The interpretation of such a situation is as follows: A decision-maker generally prefers the group membership of one provider (\({\mathcal{M}}_{1}\)) to that of the other (\({\mathcal{M}}_{2}\)). This implies that the prizes of the preferred provider give the decision-maker more utility than the exact same prizes of the dispreferred provider (\({\sum }_{\mathfrak{i}=1}^{n}u\left({f}_{1}^{{\mathcal{M}}_{1}}\left({s}_{\mathfrak{i}}\right)\right)>{\sum }_{\mathfrak{i}=1}^{n}u\left({f}_{1}^{{\mathcal{M}}_{2}}\left({s}_{\mathfrak{i}}\right)\right)\)). However, the group specific beliefs of the decision-maker change subjective probabilities in such a way that the expected utility of \({f}_{{1}^{*}}^{{\mathcal{M}}_{2}}\) increases relative to the expected utility of \({f}_{{1}^{*}}^{{\mathcal{M}}_{1}}\). These two effects precisely balance each other out so that ultimately the decision-maker is indifferent between the two alternatives.

The following example should illustrate these deliberations: Again, you have a broken leg and your choice set \(F\) contains two treatments with the same characteristics \(1\) but providers of different group membership.Footnote 28 While the provider of treatment 1 is a lawyer, treatment 2 is provided by a doctor. Thus, \(F=\{{f}_{1}^{{\mathcal{M}}_{lawyer}},{f}_{1}^{{\mathcal{M}}_{doctor}}\}\). Generally, you prefer lawyers to doctors which means that the utility of prizes provided by a lawyer is larger than the utility of the exact same prizes provided by a doctor. Now, there are three scenarios (\(S=\{{s}_{1},{s}_{2},{s}_{3}\}\)): \({s}_{1}\) = “treatment 1 is better than treatment 2”; \({s}_{2}\) = “treatment 2 is better than treatment 1”; and \({s}_{3}\) = “both treatments are equally good”. Without considering group specific beliefs, each scenario is equally likely. As a consequence, the treatment provided by the lawyer leads to more expected utility than that of the doctor (\({f}_{1}^{{\mathcal{M}}_{lawyer}}\succ {f}_{1}^{{\mathcal{M}}_{doctor}}\)). However, as soon as you also regard group specific beliefs, your subjective probabilities of the three scenarios start to change. \({s}_{2}\) gets a higher subjective probability since doctors are associated with medical expertise, which is not the case for lawyers. The higher subjective probability of \({s}_{2}\) starts to compensate for the lower utility that the doctor’s prizes generally provide. At one point, this compensating effect precisely balances the expected utility of the two treatments out (\({f}_{{1}^{*}}^{{\mathcal{M}}_{lawyer}}\sim {f}_{{1}^{*}}^{{\mathcal{M}}_{doctor}}\)).
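
A numerical illustration may help here. All numbers are hypothetical and chosen only so that the two effects cancel exactly. Assume that the utilities of the lawyer’s prizes in the scenarios \({s}_{1},{s}_{2},{s}_{3}\) are \(7.5\), \(1.5\), and \(4.5\), while those of the doctor’s prizes are \(0\), \(6\), and \(3\), so that the better, the worse, and the equally good prize are each worth \(1.5\) more when provided by the lawyer. With equally likely scenarios, the lawyer’s treatment has the higher expected utility:

$$\frac{1}{3}\left(7.5+1.5+4.5\right)=4.5>3=\frac{1}{3}\left(0+6+3\right)$$

If group specific beliefs shift the subjective probabilities of \({s}_{1},{s}_{2},{s}_{3}\) to \(\frac{1}{4}\), \(\frac{1}{2}\), and \(\frac{1}{4}\), the two expected utilities coincide:

$$\frac{1}{4}\cdot 7.5+\frac{1}{2}\cdot 1.5+\frac{1}{4}\cdot 4.5=3.75=\frac{1}{4}\cdot 0+\frac{1}{2}\cdot 6+\frac{1}{4}\cdot 3$$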

In fact, the compensating effect can also lead to a situation where the change of subjective probabilities due to group specific beliefs outcompetes a general preference for \({\mathcal{M}}_{1}\) over \({\mathcal{M}}_{2}\):

$${f}_{1}^{{\mathcal{M}}_{1}},{f}_{1}^{{\mathcal{M}}_{2}}\in F:\sum _{\mathfrak{i}=1}^{n}{\mathfrak{q}}_{\mathfrak{i}}\left({\beta }_{\theta },A\right)u\left({f}_{1}^{{\mathcal{M}}_{1}}\left({s}_{\mathfrak{i}}\right)\right)>\sum _{\mathfrak{i}=1}^{n}{\mathfrak{q}}_{\mathfrak{i}}({\beta }_{\theta },A)u\left({f}_{1}^{{\mathcal{M}}_{2}}\left({s}_{\mathfrak{i}}\right)\right)$$
$$\wedge {f}_{{1}^{\mathrm{*}}}^{{\mathcal{M}}_{1}},{f}_{{1}^{\mathrm{*}}}^{{\mathcal{M}}_{2}}\in F:\sum _{\mathfrak{i}=1}^{n}{\mathfrak{q}}_{\mathfrak{i}}\left({\beta }_{\theta },{\beta }_{\mu },A\right)u\left({f}_{{1}^{\mathrm{*}}}^{{\mathcal{M}}_{1}}\left({s}_{\mathfrak{i}}\right)\right)<\sum _{\mathfrak{i}=1}^{n}{\mathfrak{q}}_{\mathfrak{i}}({\beta }_{\theta },{\beta }_{\mu },A)u\left({f}_{{1}^{\mathrm{*}}}^{{\mathcal{M}}_{2}}\left({s}_{\mathfrak{i}}\right)\right)$$

Finally, two further cases remain. On the one hand, subjective probabilities might change due to group specific beliefs and make \({f}_{{1}^{*}}^{{\mathcal{M}}_{2}}\) more attractive, yet the change is not strong enough to outcompete or balance out a general preference for \({\mathcal{M}}_{1}\) over \({\mathcal{M}}_{2}\). On the other hand, the change of subjective probabilities might have no effect on the alternatives’ expected utilities or might even further increase the expected utility of \({f}_{{1}^{*}}^{{\mathcal{M}}_{1}}\). In both cases, the following holds:

$${f}_{1}^{{\mathcal{M}}_{1}},{f}_{1}^{{\mathcal{M}}_{2}}\in F:\sum _{\mathfrak{i}=1}^{n}{\mathfrak{q}}_{\mathfrak{i}}\left({\beta }_{\theta },A\right)u\left({f}_{1}^{{\mathcal{M}}_{1}}\left({s}_{\mathfrak{i}}\right)\right)>\sum _{\mathfrak{i}=1}^{n}{\mathfrak{q}}_{\mathfrak{i}}({\beta }_{\theta },A)u\left({f}_{1}^{{\mathcal{M}}_{2}}\left({s}_{\mathfrak{i}}\right)\right)$$
$$\wedge {f}_{{1}^{\mathrm{*}}}^{{\mathcal{M}}_{1}},{f}_{{1}^{\mathrm{*}}}^{{\mathcal{M}}_{2}}\in F:\sum _{\mathfrak{i}=1}^{n}{\mathfrak{q}}_{\mathfrak{i}}\left({\beta }_{\theta },{\beta }_{\mu },A\right)u\left({f}_{{1}^{\mathrm{*}}}^{{\mathcal{M}}_{1}}\left({s}_{\mathfrak{i}}\right)\right)>\sum _{\mathfrak{i}=1}^{n}{\mathfrak{q}}_{\mathfrak{i}}({\beta }_{\theta },{\beta }_{\mu },A)u\left({f}_{{1}^{\mathrm{*}}}^{{\mathcal{M}}_{2}}\left({s}_{\mathfrak{i}}\right)\right)$$

Yet, as previously mentioned, if changes in subjective probabilities due to group specific beliefs do not alter preferences and thereby behaviour, we do not speak of statistical discrimination. So, the above circumstances would be a case of taste-based discrimination alone. Again, the reason for this is that otherwise one could always argue that such a situation involves statistical discrimination even though it is not empirically observable.
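
The case distinctions of this section can be summarised in a short Python sketch. It is an illustration only: the function `classify`, its labels, and all numbers are hypothetical, and it assumes a choice set with a single pair of alternatives that share the “same” characteristics, so that any difference in expected utility under group unspecific beliefs is attributed to taste.

```python
from fractions import Fraction as Fr

def expected_utility(probs, utils):
    """Sum of q_i * u(f(s_i)) over all scenarios."""
    return sum(q * u for q, u in zip(probs, utils))

def sign(x):
    """Return 1, 0 or -1 depending on the sign of x."""
    return (x > 0) - (x < 0)

def classify(q_unspecific, q_specific, u_1, u_2):
    """Label one pair (f^M1, f^M2) following the case distinctions in the text."""
    before = sign(expected_utility(q_unspecific, u_1) - expected_utility(q_unspecific, u_2))
    after = sign(expected_utility(q_specific, u_1) - expected_utility(q_specific, u_2))
    if before == 0:
        # No taste for either group: only a changed preference counts.
        return "none" if after == 0 else "pure statistical"
    if after == before:
        # Beliefs did not alter the preference, so no statistical discrimination.
        return "taste-based only"
    return "taste-based and statistical"

# Hypothetical utilities in scenarios s1, s2, s3; M1's prizes carry a premium of 3/2.
u_1 = [Fr(15, 2), Fr(3, 2), Fr(9, 2)]
u_2 = [0, 6, 3]
uniform = [Fr(1, 3)] * 3

print(classify(uniform, [Fr(1, 4), Fr(1, 2), Fr(1, 4)], u_1, u_2))    # taste-based and statistical
print(classify(uniform, [Fr(1, 10), Fr(7, 10), Fr(1, 5)], u_1, u_2))  # taste-based and statistical
print(classify(uniform, [Fr(3, 10), Fr(2, 5), Fr(3, 10)], u_1, u_2))  # taste-based only
```

The first call reproduces the exactly offsetting case, the second the case in which beliefs outcompete taste, and the third the case in which the change of probabilities is too weak to alter the preference.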

2.4 How to Detect the Accurate Type(s) of Discrimination

Our decision-theoretical analysis of discrimination has led to the following distinctions: First of all, we separated motivational discrimination from behavioural discrimination and said that we mean the combination of both when we talk about discrimination, meaning motivational discrimination that gets expressed in behavioural discrimination. Then, we defined the requirements for discrimination. Next, we differentiated between social and non-social discrimination.Footnote 29 In case of social discrimination, we identified two subtypes, namely taste-based discrimination and statistical discrimination. They can be combined with each other and/or with non-social discrimination. Figure 2.1 summarises all types of discrimination.

Figure 2.1 All types of discrimination used in this dissertation

Although these types of discrimination are always distinguishable from each other in theory, this is not the case empirically since they can lead to the exact same behaviour. For example, the last chapters have shown that there are special constellations of different types of discrimination that lead to preferences which at first sight look as if they were non-discriminatory. Let’s say you prefer Mars to Snickers and group A to group B. Now, while group B only offers Mars, group A only offers Snickers. If these two preferences exactly offset each other, you are indifferent between Snickers from group A and Mars from group B. This preference ordering gives the impression that you are non-discriminatory. However, you actually display non-social and taste-based discrimination. The same seemingly non-discriminatory outcome is possible if there is a special constellation of taste-based and statistical discrimination, non-social and statistical discrimination, or non-social, taste-based, and statistical discrimination. And as the following paragraphs will show, there are further actions that look the same although they stem from different types of discrimination. So, how do we know which one applies?
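
The Mars and Snickers constellation can be written out with hypothetical numbers. The sketch relies on a simplifying assumption that is not part of the formal framework, namely that the utility of an alternative is the sum of a product component and a provider component. Let Mars be worth \(2\) and Snickers \(1\), and let provision by group A add \(1\) while provision by group B adds \(0\):

$$u\left(\text{Snickers from A}\right)=1+1=2=2+0=u\left(\text{Mars from B}\right)$$

The non-social preference for Mars and the taste-based preference for group A cancel each other out, so the observed indifference alone does not reveal that both are present.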

This question touches on a very general problem of the analysis of behaviour or, more precisely, of empirical observations that will indirectly accompany us throughout the whole dissertation: What can we know from empirical observation? This issue has been discussed for centuries. For example, Immanuel Kant (2011[1785]) examined whether someone’s behaviour can exclusively stem from moral grounds and came to the following conclusion: “In fact, it is absolutely impossible by means of experience to make out with complete certainty a single case in which the maxim of an action that otherwise conforms with duty did rest solely on moral grounds and on the representation of one’s duty.” (p. 43) Applied to discrimination, this means that we can never tell with certainty what type of discrimination an act actually involved (if any).

Yet, despite this epistemological limitation, through observing other acts we can attain a basis of comparison and in this way try to (at least partly) deduce the relevant form of discrimination. For instance, let’s say that you see someone not tipping a white waiter.Footnote 30 There are multiple possible explanations for this behaviour such as: (1) The person never tips. (2) The person tips randomly. (3) The person only tips if the service was extraordinary, which was not the case in that situation. (4) The person has a group specific belief which says that white people are generally rather affluent, which is why she did not tip the white waiter. Or (5) the person has a distaste for white waiters/people. Of course, there are actually more than these five explanations. But let’s restrict ourselves to them for the time being and treat them as if they were mutually exclusive.

Now, a day later, you see the same person tipping a black waiter. This different treatment of black and white waiters can still have various reasons: (1) The person tips randomly. (2) While the service of the white waiter was not worthy of a tip, the service of the black waiter was. (3) The person has a group specific belief which says that while white people are generally rather affluent, black people are generally rather poor, which is why she only tips black waiters. Or (4) the person has a taste for black waiters/people or a distaste for white waiters/people (or both).

Next, you observe a hundred restaurant visits of this person and notice that she never tips white waiters (68 times) but always tips black waiters (32 times). On the one hand, it is highly unlikely that the quality of service was always worse in case of white waiters than in case of black waiters. On the other hand, the fact that all black but no white waiters got a tip strongly challenges the idea of randomness. Thus, we assume that only two explanations remain: (1) The person has a group specific belief which says that while white people are generally rather affluent, black people are generally rather poor, which is why she only tips black waiters. (2) The person has a taste for black waiters/people or a distaste for white waiters/people (or both). Based on these empirical observations alone, it is difficult to deduce which one of these two is correct.Footnote 31 We would need a situation where the person’s group specific belief gets overruled by another belief, namely that her current white waiter is rather poor or that her current black waiter is rather affluent. Supposing such conditions, if the person still exclusively tips black waiters, she probably has a taste for black waiters/people or a distaste for white waiters/people (or both). Alternatively, if the person does tip a poor white waiter or does not tip an affluent black waiter, her previous different treatment seems to have been due to statistical discrimination.
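
The elimination of explanations described above can be sketched in Python. The sketch is a deliberately simplified illustration: the two decision rules, the encoding of observations, and the assumed 50% chance per visit under random tipping are hypothetical, and real behaviour would rarely follow such deterministic rules.

```python
from fractions import Fraction

# Each observation: (waiter's group, guest believes this waiter is poor, tip given).
# In the first hundred visits the belief is assumed to follow the group specific belief.
baseline = [("white", False, False)] * 68 + [("black", True, True)] * 32

def statistical(group, believed_poor):
    """Hypothetical rule: tip exactly those waiters believed to be poor."""
    return believed_poor

def taste_based(group, believed_poor):
    """Hypothetical rule: tip exactly the black waiters, whatever their wealth."""
    return group == "black"

def consistent(rule, observations):
    """True if the rule reproduces every observed tipping decision."""
    return all(rule(g, poor) == tipped for g, poor, tipped in observations)

def surviving(observations):
    """Names of the explanations that are compatible with all observations."""
    rules = {"statistical": statistical, "taste-based": taste_based}
    return sorted(name for name, rule in rules.items() if consistent(rule, observations))

# Under random tipping (assumed 50% per visit) this exact pattern has
# probability (1/2)^100, which is why randomness is set aside.
p_random = Fraction(1, 2) ** len(baseline)

print(surviving(baseline))                             # ['statistical', 'taste-based']
print(surviving(baseline + [("white", True, True)]))   # ['statistical']
print(surviving(baseline + [("white", True, False)]))  # ['taste-based']
```

The hundred baseline visits cannot separate the two explanations because both rules predict the same behaviour. Only the additional visit with a white waiter who is believed to be poor, where the rules make different predictions, allows one of them to be discarded.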

We see that in order to detect the accurate type(s) of discrimination we need a basis of comparison and thus as many empirical observations as possible. Additionally, we have to thoroughly analyse the two types of social discrimination. What do they actually include? Why do we display them or, putting it differently, what purpose do they have? What are the psychological mechanisms behind them and how are they composed? Is it possible to identify them in empirical observations, for example through controlling all other influences in experimental settings? The answers to these questions will help us to deduce the accurate type(s) of discrimination in a cluster of empirical observations. This is why the next two main chapters of this dissertation elaborate on taste-based and statistical discrimination (more precisely the beliefs used for it). We start with the former.