1 Introduction

Discrimination against certain social groups over long time periods has been a historical feature of many societies. For instance, in the US discrimination in the form of slavery officially ended in 1865 after more than two centuries, though racial segregation was maintained in the form of Jim Crow laws until 1965.Footnote 1 Starting with the civil rights movements in the early 1960s, there have been significant advances in the rights and outcomes of the black population. However, today the black population still lags behind whites in a range of socio-economic characteristics. In India, caste, which is inherited by birth, was a marker for social discrimination for centuries. At independence in 1947, the practice of untouchability was made illegal and affirmative action was enshrined in the constitution for disadvantaged groups. However, the lower castes continue to trail significantly behind other social groups in terms of most socio-economic indicators. What contributes to the gap between groups that faced discrimination over long time periods and those that did not? In what outcomes and why might we observe persistent gaps?

In this paper, we posit a channel of discrimination in which to discriminate can be the optimal response even under perfect observability of individual ability, in the absence of discriminatory social norms, and after the taste for discrimination has died out. The theoretical mechanism put forth rests on the existence of beliefs about discrimination by others in society, and on distinguishing between activities characterized by the need for interlinkages versus no need for interlinkages. In our model, activities with interlinkages require coordinated actions. If an individual decides to establish interlinkages, she requires the input of two principals to form a productive unit. The success and return for all, the individual and the two principals, is contingent on the participation of all three in the venture. The coordination failure results from the belief that somebody else might discriminate and refuse to participate in the venture, which imposes losses due to the complementarity of inputs in the production process.

The classic example would be the case of entrepreneurs who need to establish multiple interlinkages (productive relations) to be able to start and operate a venture (Basu 2010). In the theoretical model, individuals choose between entering activities which require establishing productive relations and those that do not. Individuals intending to enter activities involving interlinkages are randomly matched with a pair of principals, for instance a lender and a distributor, with whom they need to establish interlinkages to form a productive unit. The individual cannot produce without capital and cannot sell without a distributor. In case one of the principals agrees to participate and the other does not, the investment of the first principal is held up and imposes a fixed cost. We show how in the presence of beliefs about discrimination against a certain group, principals without a taste for discrimination also discriminate against that group in equilibrium.

We derive the conditions under which the model predicts lower participation rates and higher cost of establishing interlinkages for the discriminated group relative to the non-discriminated group in equilibrium, leading to an overall welfare loss for society. The model also establishes conditions under which the steady-state equilibrium is characterized by the existence of discrimination due to beliefs about the existence of taste discriminators, although there are no taste discriminators left in society. The persistence of beliefs regarding discrimination in the steady state is interpreted as intergenerational transmission of beliefs in the sense of collective memories, consistent with utility maximizing or cultural trait preserving strategies.

We empirically test our theoretical predictions in the market for self-employment, an occupation requiring the establishment of interlinkages. In particular, our focus is on the market for self-employment of blacks and whites in the United States. We show that the representation and payoff for the discriminated group in self-employment, as well as the probability and cost of establishing interlinkages, are in line with our theoretical predictions.

Next, using data from the General Social Survey (GSS) from the years 1972–2012 (Smith et al. 2012), we create proxies of beliefs about and tastes for discrimination against blacks by year and region to determine whether the presented belief-based mechanism finds any support in the data. The time trends of taste for discrimination and beliefs about discrimination from the GSS and the self-employment rates for blacks and whites from the Current Population Survey (CPS) for the time period 1972–2012 are shown in Fig. 1.Footnote 2 Taste for discrimination against blacks linearly declines over the observed period, whereas beliefs about discrimination against blacks as well as the gap between the self-employment rates for blacks and whites remain remarkably constant. Figure 1 captures the mechanism and the role of sticky or unchanging beliefs highlighted by the theoretical model in a snapshot. The unchanging beliefs perfectly correspond to the invariant gap in self-employment rates over the period analyzed, as predicted by the theoretical framework. Using a logit model, we find our proxy for beliefs about presence of discrimination to be a significant and negative correlate of the probability of becoming self-employed for blacks in the US. The results are robust to the inclusion of a race dummy to account for other unobservable characteristics of racial groups, as well as year and region fixed effects. Furthermore, using the National Survey of Small Business Finances of 1998 and 2003, we show that beliefs about discrimination also explain other features predicted by our model, namely that beliefs are a significant and positive correlate of blacks having their loan application rejected and being charged higher interest rates, and blacks reporting that they do not apply for a loan due to fear of rejection. 
The presented statistical associations are persistent across a variety of specifications and present strong evidence in favor of the theoretical framework, though no causal claims can be made on the basis of the available data.

Fig. 1 Self-employment rates by race and beliefs and taste regarding discrimination in the US

The literature of the economics of discrimination was pioneered by the seminal work of Becker (1957). In the setting envisaged, employers hold a taste for discrimination, such that working with members of a particular group imposes a cost on them, and hence these workers have to compensate the employer by either being more productive or accepting lower wages. Extensions involving mechanisms based on consumer discrimination (Borjas and Bronars 1989; Nardinelli and Simon 1990) are again contingent upon the presence of consumers who dislike purchasing from or interacting with members of another race, or in other words, individuals possessing a taste for discrimination. The class of models of statistical discrimination (Phelps 1972; Arrow 1973; Aigner and Cain 1977; Lundberg and Startz 1983, 1998; Coate and Loury 1993; Rosén 1997) and categorical thinking (Fryer and Jackson 2008) rely on the imperfect observability of worker productivity. In absence of complete information, employers base their decision on easily observable characteristics, such as race or gender, to infer the expected productivity of the worker. Mailath et al. (2000) present a model of endogenous discrimination arising from the search decision of firms. The asymmetric discriminatory equilibrium is supported due to the belief that there are more skilled workers available of a particular type, which is borne out in equilibrium. The third class of models is that of Akerlof (1976, 1985) and Peski and Szentes (2013), where not following the established norm of discrimination against certain groups might result in imposition of social sanctions which cause economic losses, making discrimination a rational response. Lang et al. (2005) present a model of a labor market characterized by wage postings where even weak discriminatory preferences can lead to large wage differences and labor market segregation. 
They discuss how the discriminatory equilibrium is not contingent on the actual use of discriminating hiring strategies by firms but requires only that the black workers believe that they do. The key distinction from our work is that in their setting, discrimination arises from just one side of the market. Thus, if a black person was to apply by mistake, that is, not accounting for her belief about discrimination, her application would be accepted and this would lead to an unraveling of the self-confirming equilibrium. For the equilibrium to unravel in our setting, we require that both the applying individual and the accepting principals do not account for their beliefs about discrimination. Thus, the self-confirming equilibrium is much more robust and also extends to a range of other phenomena such as neighborhood segregation and ethnic patronage in society (see Sect. 3.3).

Our model contributes a new mechanism as to how discrimination can persist, and provides an alternative channel through which phenomena such as racial segregation or ethnic patronage can arise and persist in societies. In our setting the distribution of ability within the two groups is identical ex-ante and ex-post, there is perfect observability of ability, and there are no social norms to discriminate. Moreover, the nature of the coordination failure highlighted does not allow for a single principal who does not discriminate to reap the unrealized profits, a possibility traditionally assumed by Becker (1957), therefore providing a theoretical rationale as to why discrimination can persist. To our knowledge, we are also the first to provide empirical evidence, albeit correlational, concerning the belief-based channel of discrimination.

2 The model

The society consists of individuals i of two types \(s_i\in \left\{ A,B\right\} \). The types A and B form social groups based on visible characteristics which do not influence performance (e.g., race, gender). Individuals of type A and B belong to the infinite sets \({\mathcal {A}}\) and \({\mathcal {B}}\), respectively.Footnote 3 Individuals have an ability \(a_{i}\), where a is distributed uniformly over [0, 1]. Ability \(a_{i}\in [0,1]\) reflects productive capacity and is perfectly observable to all. For sake of simplicity we are dropping the index i in what follows.

Those referred to throughout the paper as “individuals” can opt to engage in one of the two possible kinds of activities in the economy, which we denote by \(L\in \{0,1\}\): First, either enter the market that involves establishing interlinkages with “principals” in the economy (\(L=1\)); or, second, work in markets not requiring interlinkages with other principals (\(L=0\)). In case individual i of type s decides to enter the market not involving the establishment of interlinkages, she earns a net income on her activity equal to her ability given by:

$$\begin{aligned} W=a. \end{aligned}$$

On the other hand, if the individual enters the market requiring interlinkages, she is matched with two principals denoted by \(P\in \left\{ p_{1},p_{2}\right\} \), and earns a gross income equal to:

$$\begin{aligned} W= {\left\{ \begin{array}{ll} \lambda a &{} \text {if } c_{p_{1}}=c_{p_{2}}=1 \\ a-\delta &{} \text {if } c_{p_{1}}=0 \text { or } c_{p_{2}}=0, \end{array}\right. } \end{aligned}$$
(1)

where \(\lambda >1\), and \(c_{p_{1}},c_{p_{2}}\in \left\{ 0,1\right\} \) denote the decisions by the principals concerning whether or not to establish productive relations with the individual. The functional form in Eq. (1) exhibits an extreme form of complementarity in the actions of the principals, implying that the endeavor fails if either principal does not establish the productive relation. The intuition is that establishing a relationship with both principals is required for the individual to produce, as neither component (e.g., loan and distribution channel) can be substituted through ability. If any offer to establish the productive relation is rejected by either one of the principals, the individual faces a fixed cost \(\delta \) from the failed effort exerted in trying to form a unit and enters the market involving no interlinkages with other principals.

2.1 The static game

Individuals wanting to enter the market involving interlinkages are randomly matched with a pair of principals, \(p_{1}\) and \(p_{2}\), to try to establish productive relations. Principals have an outside opportunity of a risk free investment yielding interest r per unit invested. In case the principals decide to establish a productive relationship with the individual, they need to make an investment, which is normalized to unity. If one principal accepts an offer which the other principal rejects, then he is not able to obtain r from the risk free investment in the given period due to his capital being bound and not yielding any interest. To establish a productive relationship, and in return for the investment in their activity by the principal, the individuals offer an amount \(\sigma \) to each of the principals as repayment for the investment.Footnote 4 In order for a principal to accept, the offer has to at least give a return equal to the outside option, that is:

$$\begin{aligned} \sigma \ge 1+r. \end{aligned}$$
(2)

Therefore, the individual with the lowest ability, which we denote as \(a^*\), who can offer a share of \(1+r\) to both principals and still earn at least a is given by:

$$\begin{aligned} \lambda {a}-2(1+r) \ge {a} ~~ \text {or} ~~ a^*=\frac{2(1+r)}{\lambda -1}. \end{aligned}$$
(3)

Therefore, the strategies and payoffs of the individuals depicted in Fig. 2 resemble a span-of-control model (Lucas 1978). The more able individuals enter the market involving interlinkages and receive greater payoffs.
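As a minimal numerical sketch of Eqs. (1) and (3), the payoff rule and the entry threshold \(a^*\) can be written as follows. The function names and the parameter values are illustrative assumptions and are not taken from the paper.

```python
def gross_income(a, lam, delta, c_p1, c_p2):
    """Gross income in the interlinkage market, following Eq. (1)."""
    if c_p1 == 1 and c_p2 == 1:
        return lam * a      # both principals participate
    return a - delta        # at least one principal refuses


def entry_threshold(lam, r):
    """Lowest ability a* that can pay 1 + r to both principals, Eq. (3)."""
    return 2 * (1 + r) / (lam - 1)


def enters_interlinkage_market(a, lam, r):
    """True if net income lam * a - 2 * (1 + r) is at least the outside income a."""
    return lam * a - 2 * (1 + r) >= a


# Illustrative parameter values (assumptions, not taken from the paper).
LAM, R, DELTA = 6.0, 0.05, 0.1
a_star = entry_threshold(LAM, R)
print(f"a* = {a_star:.2f}")
print(enters_interlinkage_market(0.50, LAM, R))  # ability above a*
print(enters_interlinkage_market(0.30, LAM, R))  # ability below a*
```

Note that \(a^*\le 1\) requires \(\lambda \ge 1+2(1+r)\); for smaller \(\lambda \) no ability level in [0, 1] would find the market with interlinkages worthwhile.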

Fig. 2 Ability levels and choice of production (establish interlinkages vs. no interlinkages) with no discrimination

2.1.1 Discrimination in the static framework

We now assume that individuals and principals believe that there exists a fraction \(\varphi \) (with \(0<\varphi <1\)) of principals with a taste for discrimination. Taste for discrimination can be understood as a cost/disutility, which principals with taste for discrimination face when they decide to establish a productive relation with a B-type individual, and is denoted by the parameter \(b(>0)\). An individual is assumed to be matched randomly with the principals. Imagine a scenario in which no principal has a taste for discrimination but everybody believes that taste discrimination exists in society. An individual now knows that if she is matched with a non-taste discriminating principal, the offer has to give the principal at least an expected return of \(1+r\). Let us denote by \({\bar{c}}\) the minimum offer that a principal without taste for discrimination will accept, which is given by:

$$\begin{aligned} (1-\varphi )\sigma \ge 1+r \Rightarrow \sigma \ge \frac{(1+r)}{ (1-\varphi )} \equiv {\bar{c}}, \end{aligned}$$

that is, the offer is enough for a principal without taste for discrimination to be compensated for his fear of the other principal rejecting the offer. On the other hand, the individuals could decide to offer \(1+r+b\) to each principal and be accepted with certainty, thereby escaping any potential discrimination.
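The two candidate offers can be compared numerically. The sketch below computes \({\bar{c}}\) and \(1+r+b\) for a few believed shares \(\varphi \); the function names and the values of r, b and \(\varphi \) are assumptions chosen only for illustration.

```python
def min_offer_no_taste(r, varphi):
    """Minimum offer c_bar accepted by a principal without taste for discrimination."""
    if not 0 < varphi < 1:
        raise ValueError("varphi must lie strictly between 0 and 1")
    return (1 + r) / (1 - varphi)


def safe_offer(r, b):
    """Offer 1 + r + b, accepted by every principal including taste discriminators."""
    return 1 + r + b


# Illustrative parameter values (assumptions, not taken from the paper).
R, B = 0.05, 0.5
for varphi in (0.1, 0.2, 0.4):
    c_bar = min_offer_no_taste(R, varphi)
    cheaper = c_bar < safe_offer(R, B)
    print(f"varphi={varphi}: c_bar={c_bar:.4f}, cheaper than safe offer: {cheaper}")
```

Rearranging \({\bar{c}}<1+r+b\) shows that the risky offer is the cheaper of the two only when \(\varphi <b/(1+r+b)\).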

Observe that only two levels of offers are possible in equilibrium (i) \({\bar{c}}\); (ii) \(1+r+b\); and only one of the two will be made. The intuition underlying this is straightforward. All individuals and principals are atomistic, thus the optimal offer to one principal is the optimal offer to the other. Moreover, if any offer c satisfying \({\bar{c}}=\frac{(1+r)}{ (1-\varphi )}<c<1+r+b\) is made in equilibrium and is accepted, the agents could do strictly better by offering \({\bar{c}}\), as that is the lowest offer that will be accepted by a principal without taste for discrimination.

The game faced by the principals is as follows: the principals without a taste for discrimination when offered \(1+r+b\) know the offer will be accepted with certainty, even by a principal with a taste for discrimination because it gives a higher payoff than the outside option, and thus the dominant strategy is to always accept. In case of observing \({\bar{c}}\) they now have a dominant pure strategy depending on the value of \(\varphi \); if \((1-\varphi )\bar{c}>1+r\) they accept, and otherwise reject. Thus, the coordination problem faced by the principals in the game is a modified stag hunt with the difference that the principals have a belief about how likely it is that the other principal will join in a stag hunt (i.e. does not have a taste for discrimination) or will go hunting for a hare (i.e. has a taste for discrimination). Thus, the principals have a dominant action in pure strategies depending on the size of the offer and their belief.

The more interesting question arises for the strategy space of the individual. An individual wanting to enter the market with interlinkages is faced with two actions: (i) pay \(1+r+b\) to each principal escaping potential discrimination; (ii) offer \({\bar{c}}\) to each principal and face the risk of potentially suffering rejection, i.e. discrimination, when matched with a taste discriminating principal. For action (i) to be individually rational, we require the individual to be able to offer a share of \(1+r+b\) to both principals and still earn at least a (the outside option).

Let us denote by \(a^n\), the lowest ability B-type who can offer a share of \(1+r+b\) to both principals and still earn at least a. The value of \(a^n\) is given by:

$$\begin{aligned} \lambda {a}-2(1+r+b)\ge a\Rightarrow a\ge \frac{2(1+r+b)}{\lambda -1} \equiv a^n. \end{aligned}$$
(4)

Given that the believed share of discriminators is \(\varphi \), the expected probability of not meeting any discriminator amongst the two principals is \((1-\varphi )^2\), which we rewrite as \(\phi \equiv (1-\varphi )^2\). An individual will risk offering \({\bar{c}}\) if the expected payoff is greater than both (i) entering wage employment earning a, and (ii) offering \((1+r+b)\) to both in order to escape discrimination with certainty, earning a gross income of \(\lambda {a}\).

Let us denote by \(a^d\) the lowest ability type who fulfills condition (i) and is willing to risk discrimination. The value of \(a^d\) is given by:

$$\begin{aligned} \phi (\lambda {a}-2{\bar{c}})+ (1- \phi )(a-\delta )\ge a \nonumber \\ \Rightarrow a \ge \frac{\phi 2{\bar{c}}+(1-\phi )\delta }{\phi (\lambda -1)} \equiv a^d. \end{aligned}$$
(5)

The highest ability type who fulfills condition (ii) and is willing to risk discrimination, which we denote by \(a^m\), is given by

$$\begin{aligned} \phi (\lambda {a}-2{\bar{c}})+ (1- \phi )(a-\delta )\ge \lambda a - 2(1+r +b) \nonumber \\ \Rightarrow a \le \frac{2b + 2(1+r)-\delta -2\phi {\bar{c}} +\phi \delta }{(\lambda -1)(1-\phi )} \equiv a^m. \end{aligned}$$
(6)
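The four ability thresholds in Eqs. (3) to (6) can be evaluated together. The following sketch does so for one set of parameter values; the function name and all numbers are illustrative assumptions, chosen so that the thresholds fall inside [0, 1].

```python
def thresholds(lam, r, b, delta, varphi):
    """Return a*, a^n, a^d and a^m from Eqs. (3)-(6) for a believed share varphi."""
    phi = (1 - varphi) ** 2            # probability of meeting no discriminator
    c_bar = (1 + r) / (1 - varphi)     # minimum offer accepted without taste
    a_star = 2 * (1 + r) / (lam - 1)
    a_n = 2 * (1 + r + b) / (lam - 1)
    a_d = (2 * phi * c_bar + (1 - phi) * delta) / (phi * (lam - 1))
    a_m = (2 * b + 2 * (1 + r) - delta - 2 * phi * c_bar + phi * delta) / (
        (lam - 1) * (1 - phi)
    )
    return {"a_star": a_star, "a_n": a_n, "a_d": a_d, "a_m": a_m}


# Illustrative parameter values (assumptions, not taken from the paper).
t = thresholds(lam=6.0, r=0.05, b=0.5, delta=0.1, varphi=0.2)
for name, value in t.items():
    print(f"{name} = {value:.4f}")
```

With these values the thresholds are ordered \(a^*<a^d<a^n<a^m\), so some ability levels prefer the risky offer \({\bar{c}}\) while the most able prefer to pay \(1+r+b\).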

The following proposition now outlines how discrimination affects B-types in the static framework.

Proposition 1

  1. (1)

    If \((1-\phi )\delta <2\phi (b-({\bar{c}}-(1+r)))\) then discrimination affects the B-types in three ways:

    (i) the individuals in the ability range \(a^*\le a\le a^d \) now enter the market involving no interlinkages, whereas the A-types in the same ability range engage in joint production in the market involving interlinkages, and enjoy the associated surplus; (ii) B-types in the ability range \(a^d\le a\le a^m \) pay \({\bar{c}}\) and enter the market involving interlinkages at a higher cost, i.e. \(2({\bar{c}}-(1+r))\), than their A-type counterparts; (iii) B-types in the ability range \(a\ge a^m\) pay \(\sigma = (1+r+b)\) and enter the market involving interlinkages at a higher cost, i.e. 2b, than their A-type counterparts.

  2. (2)

    If \((1-\phi )\delta >2\phi (b-({\bar{c}}-(1+r)))\) then discrimination affects the B-types in two ways:

    (i) those with \(a^*\le a\le a^n \) work in the market involving no interlinkages, whereas the A-types in the same ability range engage in joint production in the market involving interlinkages, and enjoy the associated surplus; (ii) B-types in the ability range \(a\ge a^n\) pay \(\sigma = (1+r+b)\) and enter the market involving interlinkages at an additional cost, i.e. 2b, compared to an equally able A-type.

Proof

  1. (1)

    First observe that the inequality \((1-\phi )\delta <2\phi (b-({\bar{c}}-(1+r)))\) has an intuitive interpretation. On the one hand, offering \({\bar{c}}\) and not encountering discrimination, which happens with probability \(\phi \), yields expected savings of \(2\phi (b-({\bar{c}}-(1+r)))\) relative to offering \(1+r+b\). On the other hand, the expected deadweight loss imposed by such a strategy is \((1-\phi )\delta \), that is, the probability \(1-\phi \) of being discriminated against times the associated loss of \(\delta \). If the expected savings are greater than the expected losses, individuals prefer offering \({\bar{c}}\) and might be subject to discrimination.

    Now substituting for the values for \(a^n\) and \(a^m\) from Eqs. (4) and (6) and solving for the inequality gives us: \(a^n<a^m \Rightarrow (1-\phi )\delta <2\phi (b-({\bar{c}}-(1+r)))\).

    Thus, the stated condition in the proposition implies \(a^n<a^m\). This implies the highest ability type whose preferred action is to offer \({\bar{c}}\) has an ability greater than the threshold ability to offer a share of \(1+r+b\) to both principals and still earn a, that is, \(a^{m}>a^{n}\).

    Next consider the inequality \(a^m>a^d\). Substituting for the values of \(a^d\) and \(a^m\) from Eqs. (5) and (6) and cross-multiplying gives \((1-\phi )(2\phi {\bar{c}}+(1-\phi )\delta )<\phi (2(1+r+b)-2\phi {\bar{c}}-(1-\phi )\delta )\). Collecting terms, and using \((1-\phi )^{2}+\phi (1-\phi )=1-\phi \), this reduces to \(a^d<a^m \Rightarrow (1-\phi )\delta <2\phi (b-({\bar{c}}-(1+r)))\). Thus, the stated condition in the proposition implies \(a^d<a^m\).

    Now consider the inequality \(a^d<a^n\); substituting for the expressions of \(a^d\) and \(a^n\) from Eqs. (4) and (5) gives: \(a^d<a^n \Rightarrow (1-\phi )\delta <2\phi (b-({\bar{c}}-(1+r)))\). Thus, the stated condition in the proposition implies \(a^d<a^n\). In other words, the type whose preferred action is to offer \({\bar{c}}\) as compared to entering the market without interlinkages, has an ability level that does not allow her to pay \(1+r+b\) and still earn a, that is \(a^d<a^n\).

    Finally, observe that by definition \(a^*<a^d\). This is because \(a^*=\frac{2(1+r)}{\lambda -1}<\frac{\phi 2{\bar{c}}+(1-\phi )\delta }{\phi (\lambda -1)} =a^d \Rightarrow 2\phi (1+r)<2\phi {\bar{c}}+(1-\phi )\delta \), which holds because \({\bar{c}}>1+r\). Thus \((1-\phi )\delta <2\phi (b-({\bar{c}}-(1+r))) \Rightarrow a^*<a^d< a^n<a^m\) and results in the expected net payoff schedule depicted in Fig. 3. The beliefs regarding discrimination affect the B-types in this case in three ways: (i) the individuals in the ability range \(a^*\le a\le a^d \) now work in the market involving no interlinkages; (ii) B-types in the ability range \(a^d\le a\le a^m \) pay \({\bar{c}}\) and enter the market involving interlinkages at a higher cost; (iii) B-types in the ability range \(a\ge a^m\) pay \(\sigma = (1+r+b)\) and enter the market involving interlinkages at a higher cost.

  2. (2)

    If \((1-\phi )\delta >2\phi (b-({\bar{c}}-(1+r)))\) is satisfied, from above, we know this implies \(a^d > a^n\) and \(a^d > a^m\). In other words, no B-type is willing to risk being discriminated against; the expected losses of \((1-\phi )\delta \) arising from potentially facing discrimination and the associated deadweight loss are greater than the potential savings, that is, \(2\phi (b-({\bar{c}}-(1+r)))\). For the B-types \(a>a^n\) the payoff is greater from offering \(1+r+b\) and engaging in joint production than entering the market involving no interlinkages.

    This results in the ordering \(a^d>a^n\) and the situation graphically depicted in Fig. 4. In this case, B-types are affected in two ways: (i) B-types with \(a^*\le a\le a^n \) work in the market involving no interlinkages and (ii) B-types in the ability range \(a\ge a^n\) pay \(\sigma = (1+r+b)\) and enter the market involving interlinkages, though they pay the higher price to escape discrimination.\(\square \)

Fig. 3 Ability levels and choice of market (interlinkages vs. no strategic interlinkages) with discrimination

Fig. 4 Ability levels and choice of market (strategic interaction vs. no strategic interaction) with discrimination

A situation as in Proposition 1.1 (Fig. 3) instead of 1.2 (Fig. 4) emerges, i.e. some B-types make an offer at risk of being rejected due to discrimination, if the cost of entering wage employment after a failed attempt at self-employment (\(\delta \)) or the believed share of discriminators (\(\varphi \)) is relatively low, and if the magnitude of taste for discrimination (b) is in an intermediate range. If it is very low, then the individual might as well offer \(1+r+b\). If it is very high, she knows that no principal without taste for discrimination will be willing to take the risk and accept. The question the individual asks herself boils down to whether it is worth taking the risk of being rejected or not.
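The individual's decision can be illustrated by comparing the three expected payoffs directly: staying out (a), the risky offer \({\bar{c}}\), and the safe offer \(1+r+b\). The sketch below is a hypothetical helper with assumed parameter values; it is meant only to show how the two cases of Proposition 1 arise for a low and a high believed share \(\varphi \).

```python
def b_type_choice(a, lam, r, b, delta, varphi):
    """Return the payoff-maximizing action of a B-type with ability a."""
    phi = (1 - varphi) ** 2            # probability of meeting no discriminator
    c_bar = (1 + r) / (1 - varphi)     # minimum offer accepted without taste
    payoffs = {
        "no_interlinkages": a,
        "risky_offer": phi * (lam * a - 2 * c_bar) + (1 - phi) * (a - delta),
        "safe_offer": lam * a - 2 * (1 + r + b),
    }
    return max(payoffs, key=payoffs.get)


# Illustrative parameter values (assumptions, not taken from the paper).
PARAMS = dict(lam=6.0, r=0.05, b=0.5, delta=0.1)

for varphi in (0.2, 0.6):
    choices = [b_type_choice(a, varphi=varphi, **PARAMS) for a in (0.5, 0.6, 0.9)]
    print(f"varphi={varphi}: {choices}")
```

For the low believed share, intermediate abilities choose the risky offer, as in Fig. 3. For the high believed share, no ability level does, as in Fig. 4.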

To preview the implications of the previous two cases in a dynamic framework, picture a setting as in Fig. 3, where the B-types in the ability range \(a^d\le a\le a^m \) pay \(\sigma = {\bar{c}}\) to participate in the market involving interlinkages and, therefore, are at the risk of facing discrimination if they are paired with a discriminatory principal. This arises because the expected return from risking discrimination is greater than the expected deadweight loss. Thus, it is easy to see that if there are no taste discriminators, all their offers will be accepted, and in a dynamic model in the long run nobody would believe that discrimination exists because no offers subject to potential discrimination are ever rejected. In Fig. 4, B-types either enter the market without interlinkages or make offers that compensate the taste for discrimination. In other words, in a dynamic setting there are no offers made which could be rejected and subject to discrimination, implying there is no scope for updating beliefs and thus discrimination will persist in the long run. We formalize this intuition in what follows.

2.2 The dynamic game and the belief updating process

Taste for discrimination is assumed to arise due to a shock to the taste of a subset of principals in society at time \(t_{0}\), and a proportion \(\pi _{0}\) of principals develop a taste for discrimination equal to \(b(>0)\) against establishing a productive relation with B-type individuals.Footnote 5 We assume time is discrete and indexed by t. A principal P exits the market with exogenous probability \(\omega \) every period. A principal without a taste for discrimination always replaces the exiting principal. Therefore, the share of principals with taste for discrimination in period t is \(\pi _{t}=\pi _{0}(1-\omega )^{t}\). It is important to point out that our results rely on the assumption that it is not known to anybody in society whether or how quickly discriminators die out.

Since neither \(\pi _{t}\) nor \(\omega \) are common knowledge, decisions are conditioned on beliefs about the share of discriminators amongst principals, which are updated through observations of discrimination in the market. We assume that the event which creates a taste for discrimination results in a common initial prior among individuals and principals.Footnote 6 The common prior, denoted by \(\eta _{0}\), is modeled as having a Beta distribution, with parameters \(\alpha _{0}\) and \(\beta _{0}\) and denoted \(\eta _{0}=B(\alpha _{0}, \beta _{0})\). Moreover, we denote the density of the distribution \(\eta _{0}\) by \(\theta \). The Beta distribution gives us a density on [0, 1], which captures the beliefs held by individuals and principals regarding \(\eta _{0}\). As individuals and principals need to decide on optimal actions based on their beliefs, and all individuals and principals are assumed to be risk neutral, individuals and principals use the expected value of the Beta distribution which is given by \(E(\eta _{0})=\frac{\alpha _{0}}{\alpha _{0}+\beta _{0}}\).Footnote 7 For the sake of simplicity, we can assume that the initial prior is correct, i.e. \(E(\eta _{0}) = \pi _{0}\).

The belief updating process of principals and individuals is characterized by a standard Bayesian approach. Let \(\eta _{t}=B(\alpha _{t}, \beta _{t})\) and \(\varphi _{t}=E(\eta _{t})\), such that \(\varphi _{t}\) is the probability that individuals and principals assign to the existence of principals with taste for discrimination b in period t. Suppose in period t there are \(k_t\) offers that could be subject to discrimination, that is they lie in the interval \({\bar{c}} \le \sigma < (1+r+b)\). Moreover, assume that \({\hat{k}}_t \le k_t\) are rejected. Now based on Bayes' rule, the posterior is given by \(\eta _{t+1}=B(\alpha _{t}+{\hat{k}}_t, \beta _{t}+k_t-{\hat{k}}_t)\), with \(\varphi _{t+1}=E(\eta _{t+1})\).Footnote 8 Therefore, for any period \(T+1\), beliefs can be computed based on the initial prior \(\alpha _{0}\) and \(\beta _{0}\), and the history of observed cases of (non)discrimination as in:

$$\begin{aligned} \varphi _{T+1}= \frac{\alpha _{T}+{\hat{k}}_{T}}{\alpha _{T}+\beta _{T}+ k_{T}} = \frac{\alpha _{0}+\sum _{t=1}^{T}{\hat{k}}_{t}}{\alpha _{0}+\beta _{0}+\sum _{t=1}^{T} k_{t}}. \end{aligned}$$
(7)

From (7) the roles of the initial prior and observed cases of (non)discrimination become clear. The larger the value of \(\alpha _t\) (\(\beta _t\)) the more (fewer) discriminators are assumed to exist at time t. Also the larger these parameter values, the more inert beliefs become with less weight being placed on contemporary cases due to accumulated historic cases (or a very strong initial prior).
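The updating rule behind Eq. (7) is the standard Beta-Bernoulli update and can be sketched in a few lines. The function names, the prior parameters and the observation history below are illustrative assumptions.

```python
def update_beta(alpha, beta, k, k_hat):
    """Update Beta parameters after k risky offers, of which k_hat were rejected."""
    if not 0 <= k_hat <= k:
        raise ValueError("k_hat must satisfy 0 <= k_hat <= k")
    return alpha + k_hat, beta + k - k_hat


def expected_share(alpha, beta):
    """Mean of the Beta distribution, i.e. the believed share of discriminators."""
    return alpha / (alpha + beta)


def belief_after_history(alpha0, beta0, history):
    """Apply the update period by period; history is a list of (k, k_hat) pairs."""
    alpha, beta = alpha0, beta0
    for k, k_hat in history:
        alpha, beta = update_beta(alpha, beta, k, k_hat)
    return expected_share(alpha, beta)


# Same prior mean (0.2) with a weak and a strong prior; the history is assumed.
history = [(10, 3), (10, 0)]
weak = belief_after_history(2, 8, history)
strong = belief_after_history(200, 800, history)
print(f"weak prior:   {weak:.4f}")
print(f"strong prior: {strong:.4f}")
```

Both priors start from the same mean, but the strong prior moves far less after the same observations, which is the inertia described above.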

The channel of discrimination that we put forth works on the premise that even once all principals with taste for discrimination have died out, to discriminate against members of group B may remain the optimal action. Denote the first period when no principals with taste for discrimination are left in the economy by \(t^*\). Let the associated beliefs regarding the proportion of taste discriminators in period \(t^*\) be denoted by \(\varphi _{t^{*}}\). The following proposition shows under what conditions discrimination can persist even after all principals with a taste for discrimination have died out.

Proposition 2

Define \({\bar{f}}\equiv \frac{2b\,+\,\delta \,+\,r\,+\,1\,+\,\sqrt{2\delta ^2\,+\,4\delta \,+\,4b\delta \,+\,4\delta r\,+\,r^2\,+\,2r\,+\,1}}{2(1\,+\,b\,+\,r)\,+\,\delta }\).

  1. (1)

    If \(\varphi _{t^{*}}>{\bar{f}}\) then discrimination persists forever, even under the trembling hand, and manifests itself in two forms:

    (i) those with \(a^*\le a\le a^n \) work in the market involving no interlinkages, whereas the A-types in the same ability range engage in joint production in the market involving interlinkages, and enjoy the associated surplus; (ii) the ability range \(a\ge a^n\) pay \(\sigma = (1+r+b)\) and enter the market involving interlinkages at a higher cost, i.e. 2b, than their A-type counterparts.

  2. (2)

    If \(\varphi _{t^{*}}<{\bar{f}}\) then discrimination will not persist in the long run.

Proof

  1. (1)

    Consider the inequality \((1-\phi )\delta >2\phi (b-({\bar{c}}-(1+r)))\); substituting for \({\bar{c}}\) and solving for \(\varphi \) gives us \((1-\phi )\delta >2\phi (b-({\bar{c}}-(1+r))) \Rightarrow \varphi >{\bar{f}}\). Conversely, \(\varphi _{t^{*}}< {\bar{f}} \Rightarrow (1-\phi )\delta <2\phi (b-({\bar{c}}-(1+r)))\). Hence \(\varphi >{\bar{f}} \Rightarrow (1-\phi )\delta >2\phi (b-({\bar{c}}-(1+r)))\) and, therefore, the condition in the proposition implies that \(a^{d}>a^{n}\) and \(a^d > a^m\). From Proposition 1 we know that under these conditions no B-type is willing to risk being rejected. Thus, at time period \(t^{*}\) all B-types with \(a^*\le a\le a^n \) work in the market involving no interlinkages and B-types in the ability range \(a\ge a^n\) pay \(\sigma = (1+r+b)\) and enter the market involving interlinkages. Thus, from \(t^{*}\) onwards there will be no offers by a B-type made within the range between \({\bar{c}}\) and (\(1+r+b\)). Therefore, beliefs will remain frozen at the current level implying the above equilibrium will persist forever.

    Now assume that due to a trembling hand, a B-type in the ability range \(a^*\le a\le a^n\) mistakenly makes an offer of \(\sigma < (1+r+b)\), i.e. an offer potentially subject to discrimination. Given that the principals believe that \(\varphi _{t^{*}}>{\bar{f}}\), even for a principal without taste for discrimination the expected payoff is lower than the outside option of earning \(1+r\), due to the fear that the other principal will reject. Therefore, any principal will reject the offer and, as a consequence, the observed case of discrimination leads to an increase in \(\varphi _t\). In addition, we can even allow for a trembling hand on the part of one of the principals. Given that the other principal’s belief, nonetheless, is \(\varphi _{t^{*}}>{\bar{f}}\), he will reject. Therefore, due to the complementarity, the equilibrium is stable under the trembling hand.Footnote 9

  2. (2)

    In the previous step we learned that \(\varphi< {\bar{f}} \Rightarrow (1-\varphi )\delta <\varphi (b-({\bar{c}}-(1+r)))\), which implies \(a^*<a^d< a^n<a^m\). The best responses of the B-type in this setting are the same as in Proposition 1.1. As a consequence, there will be a number \(k_{t^{*}}>0\) of B-types in the ability range \(a^d\le a\le a^m \) that pay \({\bar{c}}\) and enter the market involving interlinkages at a higher cost. As by assumption no further taste discriminators exist, the actual cases of discrimination \({\hat{k}}_{t^{*}}\) will be zero. This implies that, based on the Beta distribution, the belief in the next period \(t^{*}+1\) about meeting a discriminator is given by \(\varphi _{t^{*}+1}=\frac{\alpha _{t^{*}}}{\alpha _{t^{*}}+\beta _{t^{*}}+k_{t^{*}}} < \varphi _{t^{*}}\) because \( k_{t^{*}}>0\). Generalizing, \(\varphi _{t^{*}+t} <\varphi _{t^{*}}\) for all \(t>0\), or \(\frac{d\varphi _{t}}{dt}<0\) for all \(t >t^{*}\). Hence, over time \(\varphi _{t}\rightarrow 0\), implying that all B-types with \(a>{a^{*}}\) apply and enter the market characterized by interlinkages, so that discrimination does not persist in society in the long run.\(\square \)
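The two belief dynamics in the proof can be illustrated numerically. The following is a minimal sketch, assuming the standard Beta updating rule in which observed rejections add to \(\alpha\) and accepted offers add to \(\beta\); with zero rejections this reduces to the expression for \(\varphi _{t^{*}+1}\) used above. The function names and the prior values are hypothetical and chosen only for illustration.

```python
def update_belief(alpha, beta, offers, rejections):
    """Beta update: rejections raise alpha, accepted offers raise beta (assumed rule)."""
    if not 0 <= rejections <= offers:
        raise ValueError("rejections must lie between 0 and offers")
    return alpha + rejections, beta + (offers - rejections)


def belief(alpha, beta):
    """Mean of the Beta(alpha, beta) distribution, i.e. the belief phi."""
    return alpha / (alpha + beta)


def belief_path(alpha, beta, offers_per_period, periods):
    """Belief over time when no taste discriminators remain (zero rejections)."""
    path = [belief(alpha, beta)]
    for _ in range(periods):
        alpha, beta = update_belief(alpha, beta, offers_per_period, 0)
        path.append(belief(alpha, beta))
    return path


# Hypothetical prior: alpha = 3, beta = 1, so phi = 0.75.
# Case (2): k = 2 exposed offers per period, none rejected, so beliefs fall.
declining = belief_path(3.0, 1.0, offers_per_period=2, periods=3)
# Case (1): no exposed offers at all (k = 0), so beliefs stay frozen.
frozen = belief_path(3.0, 1.0, offers_per_period=0, periods=3)
```

In the sketch, beliefs only move when some B-types make offers that are exposed to possible rejection; when nobody does, the posterior equals the prior in every period, which is the frozen-belief case of part (1).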

2.3 Persistence of beliefs as collective memories

The model presented above assumes that once the equilibrium set of beliefs has been established, it can persist over time. In the theoretical framework, this occurs as individuals from the discriminated group decide to opt out of the market involving interlinkages or pay a price that compensates for the taste for discrimination. Thus, no more offers to establish interlinkages are subject to discrimination and beliefs remain frozen at the current level. This leads to the question of how the persistence of such beliefs can be rationalized in the real world. What mechanism underlies the stickiness of beliefs in such settings? We interpret the transmission of beliefs in our model as happening through intergenerational transmission of a collective memory regarding discrimination. The contemporary usage of the term collective memory can be traced back to Emile Durkheim (1858–1917), and his student Maurice Halbwachs (1877–1945), who published the seminal study titled The social framework of memory in 1925. The concept of memory has been constructed in the literature as how the mind works in a society and how its operations are structured by social arrangements. Halbwachs argues: “It is in society that individuals normally acquire their memories. It is also in society that they may recall, recognize and localize their memories” (Halbwachs 1992, 38). The formulation of memories regarding the past is hence affected by the transmission of cultural beliefs and norms in society.

Beliefs regarding discrimination can be seen to fulfill the two important criteria to be categorized as collective memories. First, events which influence the collective memory are widely documented and recorded in these societies (Griffin and Bollen 2009). Thus, instances of racial discrimination in South Africa or the United States, or caste-based discrimination in the context of India and Nepal, are events that have been widely recorded and recollected. Second, there is a consensual view of the recollected past. The presence of affirmative action policies in India, South Africa, or the US serves as a signal from the public at large concerning the need to address previous wrongs.

The transmission of beliefs regarding discrimination as collective memory through generations can also be rationalized by economic models of cultural transmission such as Bisin and Verdier (2001) and Dessí (2008). They show that transmission of existing beliefs by parents to their offspring would be consistent with maximizing the utility of children or preserving their cultural traits. Finally, the importance of history, culture, and past events such as discrimination in shaping today’s beliefs, behavior, and outcomes has also been demonstrated in the empirical literature (Nunn and Wantchekon 2011; Voigtländer and Voth 2012; Alesina et al. 2013b) and the theoretical literature (Argenziano and Gilboa 2012). Thus, beliefs regarding discrimination could be understood as collective memories that are passed on from one generation to another, which can be remarkably stable for long stretches of time.

3 Data and empirics

As foreshadowed in our discussion in the theoretical section, we empirically investigate the market for self-employment in the US, an occupation characterized by the need to establish interlinkages across markets.

3.1 The theoretical predictions and the characteristics of the market for self-employment—a comparison

The first two theoretical predictions of our model state that members of the discriminated group (black individuals in the US) are, first, less likely to be self-employed and, second, enjoy lower returns from self-employment. The model also predicts that the gaps in representation and earnings between the discriminated and non-discriminated group are decreasing in ability.

To examine the gap we use the 2006 American Community Survey (ACS) provided by the Integrated Public Use Microdata Series (IPUMS) (King et al. 2010).Footnote 10 As ability is not directly observable, we use education as a proxy for ability, and classify individuals possessing a college degree or more as high ability and others as low ability. The first column of Table 1 shows the odds ratios of a logistic regression with self-employment as the dependent variable, while controlling for age, age squared, gender, and state fixed effects. College graduates are more likely, whereas blacks are less likely, to be self-employed.Footnote 11 The gap for blacks with a college degree is smaller, as indicated by the statistically significant odds ratio of the interaction term, which is larger than unity. In the second column of Table 1, we restrict the sample to those that are self-employed and explain the log of total earnings. Again, we find a significant positive gap in earnings for those with a college degree and a significant negative gap for blacks. Notice that, in line with our theoretical prediction, the gap for blacks with higher ability, as proxied by possessing a college degree, is relatively smaller.
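To see why an interaction odds ratio above unity signals a smaller gap at the top of the ability distribution, the following minimal sketch combines odds ratios multiplicatively, as a logit model with an interaction term implies. The numerical values are hypothetical placeholders and are not the estimates reported in Table 1.

```python
def combined_odds_ratio(or_black, or_college, or_interaction, black, college):
    """Odds of self-employment relative to a white individual without a degree."""
    ratio = 1.0
    if black:
        ratio *= or_black
    if college:
        ratio *= or_college
    if black and college:
        ratio *= or_interaction  # interaction applies only to black graduates
    return ratio


# Hypothetical odds ratios, for illustration only.
OR_BLACK, OR_COLLEGE, OR_INTERACTION = 0.5, 1.2, 1.3

# Black-white odds ratio among individuals without a college degree.
gap_no_college = (
    combined_odds_ratio(OR_BLACK, OR_COLLEGE, OR_INTERACTION, True, False)
    / combined_odds_ratio(OR_BLACK, OR_COLLEGE, OR_INTERACTION, False, False)
)
# Black-white odds ratio among college graduates: OR_BLACK * OR_INTERACTION.
gap_college = (
    combined_odds_ratio(OR_BLACK, OR_COLLEGE, OR_INTERACTION, True, True)
    / combined_odds_ratio(OR_BLACK, OR_COLLEGE, OR_INTERACTION, False, True)
)
```

With these placeholder values the black-white odds ratio moves closer to one among college graduates, which is the pattern the text describes as a smaller gap for higher ability.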

Table 1 Features of the market for self-employment—effect of social identity and ability

We next turn to the outcomes regarding the probability of having a loan application rejected, not applying for a loan due to fear of rejection, and the cost of establishing interlinkages by race, using the National Survey of Small Business Finances (NSSBF) of 1998 and 2003. The first outcome we consider is whether the probability of rejection of a loan application differs by race and ability of the applicant. In order to rule out that other characteristics are responsible for these differences in the probability of loan rejection, we control for an extensive set of measures of creditworthiness as well as firm, loan, and owner characteristics, as in Blanchflower et al. (2003), and restrict the sample to blacks and whites.Footnote 12 Column (3) of Table 1 shows that blacks are 18.7 percentage points more likely to have their loan application rejected. Also, we find that the gap in loan rejection is smaller at the top end of the ability distribution, as blacks with a college degree are only nine percentage points more likely to have their loan application rejected than whites with a college degree.

Column (4) considers whether blacks are more likely to fear rejection of a loan application and therefore do not apply for a loan. This corresponds to the theoretical prediction that individuals of the B-type in the intermediate ability range do not apply to establish interlinkages, whereas individuals of the A-type in the same ability range do so. Blacks are 15.3 percentage points more likely to fear rejection and consequently do not apply for a loan.

The final outcome variable we consider is whether the interest rate charged differs by race. This prediction refers to the B-type individuals who pay a strictly higher fee for establishing interlinkages compared to A-types with the same ability level. Column (5) of Table 1 shows that black entrepreneurs are charged 1.15 percentage points more than comparable whites, while blacks with a college degree, in contrast to the theory, actually are charged 2.07 percentage points more than a comparable white individual with a college degree. It is important to note that some of the preceding results, such as differences in participation and returns between ethnic groups in self-employment, as well as the probability of rejection of a loan application and rates charged, have already been put forth by the empirical literature dealing with discrimination and self-employment in the US (Moore 1983; Borjas 1986; Bailey and Waldinger 1991; Fairlie 1999; Fairlie and Meyer 1996, 2000; Blanchflower et al. 2003; Fairlie and Robb 2008; Blanchflower 2009). Our objective is to establish that the numerous documented features of the market for self-employment are consistent with the mechanism outlined by our model, while also documenting that additional features predicted by our theory, namely, that the gaps are smaller at the top end of the ability distribution and blacks are more likely to fear rejection, are also borne out in the data. However, the question whether the belief-based mechanism presented in Sect. 2 could be responsible for the observed outcomes remains open, as conceivably other mechanisms could reproduce the observed features. We turn to this task in the following section.

3.2 Evidence for the belief-based mechanism of discrimination

We use the General Social Survey (GSS) from 1972 to 2012, comprising 29 questionnaires, to provide empirical support for the belief-based mechanism presented in Sect. 2. Crucially, the GSS allows us to construct proxies for the belief about and taste for discrimination parameters in our model. We construct two proxies of taste for discrimination by computing the share of whites by year and region that express taste for discrimination. We define taste discriminators to be:

  1. (1)

    Whites answering “yes” to “Do you think there should be laws against marriages of Blacks and Whites?”

  2. (2)

    Whites who are “very” or “somewhat opposed” when asked “What about having a close relative marry a Black person?”

In order to construct a proxy for beliefs regarding discrimination, we take the share of the sample, for each year and region, answering the following question with “yes”:

  • “On the average Blacks/African-Americans have worse jobs, income, and housing than White people. Do you think these differences are mainly due to discrimination?”
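The construction of these proxies amounts to computing, for each year-region cell, the share of valid responses that are affirmative. A minimal sketch follows; the record layout and the field names (`year`, `region`, `race`, `favors_law`) are hypothetical and do not correspond to actual GSS variable names.

```python
from collections import defaultdict


def share_by_cell(records, answer_key, keep=lambda rec: True):
    """Share of affirmative answers per (year, region) cell.

    Records where the question was not asked (answer is None) or that fail
    the `keep` filter are skipped.
    """
    tallies = defaultdict(lambda: [0, 0])  # cell -> [affirmative, valid answers]
    for rec in records:
        answer = rec.get(answer_key)
        if answer is None or not keep(rec):
            continue
        cell = (rec["year"], rec["region"])
        tallies[cell][0] += int(bool(answer))
        tallies[cell][1] += 1
    return {cell: yes / total for cell, (yes, total) in tallies.items()}


# Hypothetical respondents, for illustration only.
respondents = [
    {"year": 1972, "region": "South", "race": "white", "favors_law": True},
    {"year": 1972, "region": "South", "race": "white", "favors_law": False},
    {"year": 1972, "region": "South", "race": "black", "favors_law": False},
    {"year": 1972, "region": "West", "race": "white", "favors_law": None},
]

# Taste proxy: restricted to white respondents, as in the definitions above.
taste_proxy = share_by_cell(
    respondents, "favors_law", keep=lambda rec: rec["race"] == "white"
)
```

The same function could be applied without the race filter to the belief question, since that proxy is defined over the sample in each year and region.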

Unfortunately, none of these questions is asked throughout all survey years, which, depending on the specification, restricts our sample size to between 14,719 and 26,339 observations. Not decomposing by region, beliefs about discrimination among whites peak in 1985 at 45% and reach the lowest point in 2004 at 34%. Our first measure for taste for discrimination among whites declines from 39% in 1972 to 10% in 2002. The second measure declines from 66% in 1990 to 21% in 2012.Footnote 13

The use of survey responses is susceptible to the problem that responses to delicate questions, such as those concerning discrimination, can be subject to a social desirability bias. A respondent might claim not to have discriminatory taste, which might not reflect real preferences. In order to validate that we are capturing a real trend in discriminatory taste, in Fig. 5 we plot our second measure of taste for discrimination at the aggregate level against a range of racially-motivated hate crimes committed in the US against blacks (namely the number of total victims, murder and manslaughter, forced rape, aggravated assault, simple assault, and intimidation).Footnote 14 The hate-crime statistics are obtained from the Federal Bureau of Investigation (FBI) Uniform Crime Reports for the years 1996–2012.Footnote 15 The proxy for taste for discrimination aggregated at the national level (gray dashed line) follows a downward trend closely resembling the downward trend for all racially-motivated hate crimes against blacks, with correlations ranging from 0.76 for forced rape to 0.91 for murder and manslaughter. Racially-motivated hate crimes could be seen as extreme expressions of discriminatory taste, so the exhibited patterns strengthen the validity of our taste for discrimination measure.Footnote 16

Fig. 5 Racially-motivated hate crimes versus taste for discrimination
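The validation exercise rests on the correlation between two annual series. As a minimal sketch, assuming the reported correlations are Pearson coefficients (the text does not state the type), such a coefficient could be computed as follows. The two series are made-up placeholders, not the GSS or FBI data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Placeholder annual series: share expressing taste for discrimination and a
# hate-crime count. Both decline over time, so the correlation is positive.
taste_share = [0.60, 0.52, 0.45, 0.38, 0.30, 0.21]
hate_crimes = [4200, 3900, 3700, 3300, 3100, 2600]
correlation = pearson(taste_share, hate_crimes)
```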

3.2.1 Self-employment

The first outcome we consider is the difference in self-employment rates between blacks and whites, or in other words, the representation of the two social groups in activities involving interlinkages. Following the theoretical framework, we estimate the probability of being self-employed as a function of ability \(a_{i}\), the proportion \(\pi _{tqs} \) of principals with a taste for discrimination at time t in region q against group s, the proportion \(\varphi _{tqs}\) with beliefs about discrimination at time t in region q against group s, and a vector of individual characteristics \(X_{i}\). As a proxy for ability we use years of schooling.

The proportion of principals with a taste for discrimination \(\pi _{tqs}\) and the proportion with beliefs about discrimination \(\varphi _{tqs}\) take the value zero for white individuals, i.e. for \(s=A\). We restrict our sample to white and black respondents who are not students or retired, while assuming no differences in preferences to become self-employed.Footnote 17 We estimate a logit model and control for gender, age, age squared, and whether the father was self-employed. All specifications include time and region fixed effects.
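One way to write down the specification described in words above is the following sketch. The coefficient names \(\beta_0,\dots,\beta_3\), \(\gamma\), the outcome indicator \(SE_i\), and the fixed-effect symbols \(\mu_t\) and \(\nu_q\) are notation assumed here for illustration, and the variables are taken to enter linearly in the index, which the text does not state explicitly.

```latex
\[
\Pr(SE_{i}=1) \;=\; \Lambda\!\left(\beta_0 + \beta_1 a_{i} + \beta_2 \pi_{tqs}
  + \beta_3 \varphi_{tqs} + X_{i}'\gamma + \mu_t + \nu_q\right),
\qquad
\Lambda(z) = \frac{1}{1+e^{-z}},
\]
\[
\pi_{tqs} = \varphi_{tqs} = 0 \quad \text{for } s = A .
\]
```

Under this reading, \(\beta_2\) and \(\beta_3\) are identified from variation across years and regions among black respondents only, since both proxies are set to zero for whites.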

The results of the baseline regression are reported in Table 2. In columns (1) and (2), we show that either proxy for taste for discrimination against blacks is a significant negative correlate of self-employment only as long as the proxy for belief about discrimination does not enter the model. Once belief about discrimination enters the model, either proxy for taste for discrimination becomes insignificant as can be seen in columns (3) and (4). The variable representing belief about discrimination is significant at the 1% level when paired with taste for discrimination. In columns (5) and (6), we add a race dummy for blacks to validate that unobservable characteristics correlated with being black are not responsible for the observed outcomes. The race dummy turns out to be insignificant, whereas belief about discrimination remains a significant negative correlate at the 5% level in column (5), but becomes insignificant in column (6).

Table 2 Baseline logistic regression

If we were to interpret the correlation as causal, eliminating the effects of belief about discrimination and unobservables correlated with race from the estimation in column (5) would increase the average self-employment probability for blacks from 6.4 to 10.6%, which is a substantial increase of 66%. However, this is still lower than 13.4%, the average probability for whites. The remaining gap can be attributed, amongst other things, to lower levels of education and demographic factors.
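The arithmetic behind these figures can be checked directly. The short sketch below uses only the three probabilities quoted in the paragraph; the variable names are illustrative.

```python
# Average predicted self-employment probabilities quoted in the text (percent).
black_baseline = 6.4
black_counterfactual = 10.6
white_average = 13.4

# Relative increase for blacks when the belief effect is removed.
relative_increase = (black_counterfactual - black_baseline) / black_baseline

# Share of the original black-white gap closed, and what remains (pct. points).
original_gap = white_average - black_baseline
remaining_gap = white_average - black_counterfactual
share_of_gap_closed = 1 - remaining_gap / original_gap
```

The relative increase rounds to the 66% stated above, and the counterfactual closes 4.2 of the original 7.0 percentage-point gap, leaving 2.8 points to other factors.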

3.2.2 Loan rejections, fear of rejection, and rates charged

To provide further evidence of the highlighted mechanism, we next return to the outcomes regarding loan rejection, fear of loan rejection, and the interest rates charged on loans from the NSSBF 1998 and 2003 and link them to our measures for taste and belief about discrimination by year and region derived from the GSS.Footnote 18 To account for other determinants of loan acceptance, fear of loan rejection, and interest rates charged, we include a rich set of controls in all specifications, which are outlined in the notes of Table 3.

Table 3 OLS of loan application

Columns (1) and (2) of Table 3 examine the dependent variable of whether a loan application was rejected. We introduce our proxies for taste and belief about discrimination as explanatory variables and find that belief about discrimination increases the probability of failure of the discriminated group’s attempt to establish an interlinkage. When adding up the effect of the black dummy with belief about discrimination and taste for discrimination, which both have a positive sign, blacks are still more likely to have loan applications rejected. The fact that we include an extensive set of controls regarding creditworthiness and past firm performance, as well as a race dummy for other unobservable characteristics affecting the probability of repayment, and still find proxies for beliefs about and taste for discrimination to be significant correlates of obtaining a loan suggests that explanations based on discrimination, rather than credit rationing explanations as in Stiglitz and Weiss (1981), might be more relevant in the above setting.

Columns (3) and (4) consider the dependent variable of whether an individual at any point during the last 3 years did not apply for a loan due to fear of rejection despite requiring credit at that moment, and columns (5) and (6) in turn consider the rate of interest charged by the lenders to firm owners. The results are very similar to the ones observed in columns (1) and (2) of Table 3. Belief about discrimination increases the probability that a member of a discriminated group does not apply for a loan due to fear of rejection. Therefore, blacks have a lower probability of even attempting to establish an interlinkage due to fear of rejection as compared to white individuals. Similarly, controlling for all other observable determinants of interest rates on loans, belief about discrimination increases the rate of interest charged to members of the discriminated group. Thus, the results presented in Tables 2 and 3 provide strong support for the belief-based mechanism of discrimination presented earlier. Though the empirical strategy does not allow us to establish a causal link, it is important to note that beliefs are a significant correlate even when we control for region and time fixed effects, a black dummy, as well as a host of other individual-level characteristics, suggesting that well-established beliefs about discrimination might indeed be an important factor affecting participation and returns in activities involving interlinkages. Furthermore, the fact that the black dummy is insignificant in columns (5) and (6) of Table 2 and switches signs in Table 3 suggests that beliefs, rather than some unobservable characteristic correlated with race, could indeed be a driving force behind differences in self-employment.Footnote 19 This is in line with the finding of Fairlie (2002), who shows that controlling for Armed Forces Qualification Test (AFQT) scores does not significantly reduce the black-white gap concerning self-employment rates, contrary to Neal and Johnson (1996), who find that premarket skills measured by the AFQT account for most of the black-white wage gap. Therefore, discrimination could explain part of the observed differences in participation rates and returns to self-employment.

3.3 Additional stylized evidence and applications of the theoretical framework

In this subsection, we present further evidence in the form of recent findings in the empirical and behavioral literature that our theoretical framework can reconcile. We then go on to highlight how the presented framework can also be useful for analyzing issues such as the phenomenon of racial tipping points in American neighborhoods.

Alesina et al. (2013a) find that banks in Italy charge self-employed women more than self-employed men for credit. They find that characteristics such as riskiness, type of business, or differential bank choice cannot explain their result. They also find that the effect is not restricted to any particular geographical region and that taste-based indicators of discrimination cannot explain the observed pattern. As female-run businesses need to establish interlinkages, beliefs of banks that potential productive male links might discriminate against women might result in banks discriminating against women too. Consistent with our theoretical model, the authors find that banks discriminate more against women in sectors where men dominate, which can be interpreted as beliefs about the likelihood of being matched with a discriminatory male link being higher in those sectors.

The mechanism put forth is also a plausible explanation for features highlighted in data for the market for self-employment in India and Sweden. For instance, it can explain why the Scheduled Castes (SCs) and Scheduled Tribes (STs), the socially most disadvantaged groups in India, are relatively more underrepresented in urban rather than rural areas in terms of non-farm enterprise ownership, even though discrimination is higher in rural areas (Iyer et al. 2013). It can also explain why Sweden, one of the countries where women’s labor force participation rate is very high and only 0.4% of the male population strongly agree that men make better business executives than women, has among the lowest levels of self-employment for women in the EU.Footnote 20 The fact that beliefs about discrimination are higher in urban rather than rural areas in India, and remain high in Sweden concerning women, could be an important explanatory factor.Footnote 21

Daskalova (2018) documents in a lab experiment that people who do not discriminate when making decisions individually, discriminate while making joint decisions due to beliefs about what their co-decision maker will do. Albrecht et al. (2013) find that in the lab individuals are conservative in updating their beliefs, which points to another channel through which beliefs regarding discrimination might become sticky over time and be an important determinant of outcomes for the discriminated group.

Our model is also applicable to a range of markets with strategic complementarities. The dominance of particular ethnic groups in certain professions (Greif 1989, 1993; Banerjee and Munshi 2004) might be explained through our mechanism as ethnic enclaves can help secure complementary support from other individuals and overcome coordination failures.Footnote 22 Card et al. (2008) assume that when black people move into a neighborhood, white neighbors with a distaste for blacks will change neighborhoods. Anticipating a decrease in housing prices, people without a distaste for black neighbors will also sell their property and move. We show that the presence of neighbors with a distaste for black neighbors is not required to trigger the segregating dynamics; the belief is sufficient, hence providing an alternative explanation for the phenomenon of racial tipping points in the United States.

The model can also explain the prevalence of ethnic patronage in societies. Though individuals themselves might feel that ethnic favoritism is wrong, they might still support it, as they believe that other groups are going to engage in patronage politics and the failure to do so by their own group might result in domination by the other groups. As Posner (2005, 103) notes “Zambians with such views have little choice but to behave in accordance with the expectation that ethnic favoritism will take place. Faced with the logic of a prisoner’s dilemma, most decide that, despite their misgivings about the practice, supporting members of their own groups is preferable to not doing so and being dominated by people from other groups that do.” In a similar vein Van den Berghe (1971, 515–516) explains how “Universal expectations of “tribalism” can lead to a systematic interpretation of others’ behaviour as “tribalistic,” which in turn produces pre-emptive “tribalism,” and the latter further reinforces the expectation. Expected and actual behaviour feed on each other in the classic self-fulfilling prophecy.” In such a setting ethnic favoritism comes to prevail not because individuals champion it, but rather because they fear the consequences of not doing so.

4 Policy considerations

The belief-driven gridlock put forth by the model, in which discrimination can persist in equilibrium and leaves everybody worse off, provides opportunities for affirmative action to move the economy to the “good” equilibrium as a focal point in the coordination game. The analysis is restricted to the long run equilibrium where no taste for discrimination remains, but discrimination persists due to beliefs.

Provision of financial subsidies to the B-types with sufficiently high abilities to become entrepreneurs, but who are being discriminated against, is a potential remedy. With the subsidy they could afford to pay the higher amount, such that beliefs about discrimination would be compensated and their offers would be accepted with certainty. This measure would overcome the problem that beliefs are preventing both principals from accepting and individuals from applying. On the downside, this provides a solution only as long as the subsidy is in place, as this solution does not change beliefs. Moreover, the welfare effect would be negative, as the additional value creation attributed to self-employment sums up to less than the subsidy.Footnote 23

Another method of achieving equality between A- and B-types of equal ability would be to discriminate against the discriminator. By imposing a fine F on principals who reject a B-type offering the same amount as an equal A-type that has been accepted in the same period, one could target equal treatment of A- and B-types.Footnote 24 This equal treatment might come at a high cost, though. If one principal interacts with various individuals in a given period, there exists the possibility that principals begin discriminating against the A-type as well in order to avoid the fine when rejecting the B-type. Imagine a principal receiving the same offer \({\hat{\sigma }}\) from two individuals with identical ability \(\hat{a}\), but of different types A and B, in the same period. Now if he accepts the A-type and rejects the B-type, he will receive \({\hat{\sigma }}+1+r-F\), assuming that the other principal accepts the A-type offer as well. This would only be rational if \({\hat{\sigma }}-F\ge 1+r\), because otherwise he would be better off rejecting the A-type as well. Therefore, discrimination could spill over to the A-type.
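The principal's comparison under the fine can be made explicit. The sketch below assumes, consistent with the inequality above, that rejecting both offers yields the outside option \(1+r\) twice, and that accepting the B-type is not considered because the principal's beliefs lead him to reject it. The function name and the parameter values are hypothetical.

```python
def fine_spills_over(sigma_hat, r, fine):
    """Return True if the fine makes the principal reject the A-type as well.

    Compares accepting A and rejecting B (paying the fine) with rejecting
    both offers; the payoff from rejecting both is an assumption of the sketch.
    """
    accept_a_reject_b = sigma_hat + (1 + r) - fine
    reject_both = 2 * (1 + r)
    # Equivalent to checking whether sigma_hat - fine < 1 + r.
    return accept_a_reject_b < reject_both


# Hypothetical values: offer 1.5, return r = 0.1.
small_fine = fine_spills_over(sigma_hat=1.5, r=0.1, fine=0.2)  # 1.3 >= 1.1
large_fine = fine_spills_over(sigma_hat=1.5, r=0.1, fine=0.5)  # 1.0 < 1.1
```

In this illustration a small fine leaves the A-type unaffected, whereas a sufficiently large fine makes rejecting both offers the better option, which is the spillover described in the text.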

By requiring lenders to give an equivalent share of credits at similar conditions to the B-type, as observed over past periods for the A-type, lenders would be forced to accept offers by the B-type. This share would have to be benchmarked by total lending in the past conditioned on economic indicators, in order to avoid discrimination against the A-type. This measure by itself would not be sufficient, though, as individuals of the B-type would continue not to apply and distributors would continue to reject out of fear of discrimination. This intervention would have to be communicated publicly, such that it would serve as a signal and would spill over to the beliefs of the B-type and the distributors. To see this in terms of our model, imagine the government announcing publicly and credibly the implementation of this measure. Now there would be no reason for the distributor or the individual of the B-type to assign \(\varphi >0\). The great advantage of this intervention would be that intervening in one market would be enough to correct beliefs in other markets. Once the measure were to be removed, beliefs about discrimination would have vanished and no further discrimination would take place (assuming no taste for discrimination). Of course the functioning of this intervention hinges on the assumption that an individual only requires two principals. In an economy with n principals the government would have to intervene in \(n-1\) markets.

A further possibility to overcome the coordination failure would be the creation of an institution acting as a coordination device, providing the service of linking pre-screened non-discriminatory lenders and distributors to able B-types wanting to become entrepreneurs. As this could even be a profitable exercise, such institutions might automatically arise and be provided by the market itself.

In the above we saw that schemes such as subsidies or equal treatment regulations might only address the problem myopically or, even worse, have undesirable consequences (like discrimination against A-types in equilibrium).

5 Conclusion

In this paper we show that even once taste for discrimination and statistical discrimination cease to exist, discrimination can persist due to remaining beliefs making discrimination the best response, a much weaker condition than traditionally assumed in the literature.

The theoretical mechanism put forth is relevant for markets characterized by the need to establish productive relations or interlinkages with others in order for the production process to be carried out. It is shown that in such markets the presence of beliefs regarding the existence of taste discriminators, even when no more taste for discrimination exists, can result in discriminatory behavior in equilibrium. Discrimination arises as a rational response to the belief that others might discriminate, which would impose losses due to the complementarity in the production process. The model exhibits lower participation and payoffs for the discriminated group in markets characterized by the presence of interlinkages.

Empirical evidence in support of the theoretical framework is provided by analyzing the market for self-employment, a market characterized by the need to establish productive relations to be able to operate and be successful. The outcomes predicted by the model, namely lower participation rates, income, and success in establishing interlinkages for the self-employed from the discriminated group, as well as the cost of establishing interlinkages being higher, are confirmed in the data. Using the General Social Survey 1972–2012 of the US, we create proxies for taste and beliefs regarding discrimination. We validate that the downward time trend of our proxies of taste for discrimination does not necessarily reflect a social desirability bias, as the proxies are strongly correlated with the time trend of racially-motivated hate crimes against blacks. A simple logit model reveals that beliefs about discrimination are a significant negative correlate of self-employment for blacks. Furthermore, we show that beliefs about discrimination also predict the rejection of a loan application, fearing rejection and therefore not applying, as well as the interest rates charged.

The nature of discriminatory coordination failures does not allow market forces to overcome discrimination and may require alternative policy tools. The various mechanisms through which discrimination manifests its dynamic linkages in terms of cross market and intergenerational effects, and the tendency to persist through cumulative and belief-based channels, need to be understood and explored in order to develop policies aimed at eradicating discrimination and achieving equal treatment and opportunities.