Ruled by robots: Preference for algorithmic decision makers and perceptions of their choices

As technology-assisted decision-making becomes more widespread, it is important to understand how the algorithmic nature of the decision maker affects how decisions are perceived by the people they affect. We use a laboratory experiment to study the preference for human or algorithmic decision makers in redistributive decisions. In particular, we consider whether an algorithmic decision maker will be preferred because of its unbiasedness. Contrary to previous findings, the majority of participants (over 60%) prefer the algorithm as a decision maker over a human, but this is not driven by concerns over biased decisions. Yet, despite this preference, the decisions made by humans are regarded more favorably. Participants judge the decisions to be equally fair, but are nonetheless less satisfied with the AI decisions. Subjective ratings of the decisions are mainly driven by own material interests and fairness ideals. For the latter, players display remarkable flexibility: they tolerate any explainable deviation between the actual decision and their ideals, but react very strongly and negatively to redistribution decisions that do not fit any fairness ideal. Our results suggest that algorithmic decision makers might be preferred even in the realm of moral decisions, but the actual performance of the algorithm plays an important role in how its decisions are rated.


Introduction
I would never judge you. I do not belong to any country or religion. I am only out to make your life better.
-GPT-3, a text-generating algorithm, in an essay for The Guardian1

Algorithms and Artificial Intelligence (AI) have become an integral part of our decision making, not only for personal but also for important professional decisions. Managers increasingly rely on algorithmic aids when determining how to assign a bonus or other performance incentives, who should return to work at the office after the pandemic, whom to hire, and what salary to offer them (Fisher, 2019; Grensing-Pophal, 2021; Riberolles, 2021; Van Esch et al., 2019). While individual companies and managers can decide if they want to rely on digital technology for these decisions, and determine how much weight to give to their suggestions, those affected by the decisions cannot (directly) determine whether AI decision support is used. Yet, they may perceive or react differently to a decision depending on whether it was taken by a human or an algorithmic decision maker. As the number of possible applications of this technology, which offers clear advantages in terms of operational efficiency (Solow, 1957; Stiroh, 2001), grows, we consider a set of important, yet easy to overlook questions: Would those who are affected by a decision prefer an algorithm or a human to make it? How will the nature of the decision maker affect the perception of the decision? Will people react to decisions differently if they come from a human manager as compared to an algorithm? Specifically, we consider these questions in the context of income redistribution.
Redistributive decisions taken on behalf of others represent a wide range of common situations, both in the workplace (how to assign a bonus for a team task, or who gets an undesirable task) and in the economic and political context in general, ranging from taxation to social support, unemployment benefits, education policies, monetary policies and many more. These types of decisions are especially interesting for the question of whether people would want or accept an AI decision maker. Unlike calculation and prediction tasks, where algorithms are widely employed and accepted (see, e.g., Humm et al., 2021), there are no objectively correct solutions in such scenarios. In this sense, redistributive decisions can be seen as a type of moral decision, where the definition of correct or fair depends on the observer's personal ideals and beliefs. As a consequence, redistributive decisions often spark controversy and lead to societal tensions and conflicts (e.g., Sznycer et al., 2017; Wakslak et al., 2007). Determining which decision maker is preferred and whose decisions are perceived to be fairer can potentially improve the acceptance of such decisions or policies, and, with it, compliance.
The nature of the decision maker might affect the perception of the decision, its acceptance, and potentially also compliance with it, independent of the decision itself. In a managerial context this can severely impact the performance of the affected workforce. A good illustration of this link can be found in Bai et al. (2020). The authors conducted a field experiment in an Alibaba warehouse where pick lists (i.e., lists of items the workers need to collect from different shelves in the warehouse) were distributed either by a computer terminal or by a human manager. The group that received pick lists from a computer terminal perceived them to be fairer, which led to a striking increase in picking efficiency of almost 20%. If this holds true in other domains of decisions made for others, then utilizing the most widely accepted type of decision maker will increase satisfaction with the decision or policy and, by extension, compliance. Apart from considerations of fairness, different performance expectations might be important. For example, Strobel (2019) examines whether employees exert more effort if the threshold performance for receiving a bonus is set manually on a case-by-case basis or through an automatic system, and finds that effort is significantly lower under an automated performance evaluation system. Yet, this result appears to be driven by the fact that employees expected lower performance thresholds in the automated condition and thus expected to receive a bonus for lower effort.
The newly emerging but rapidly growing literature on how people perceive algorithmic decisions and engage with algorithms for certain tasks finds mixed results and provides a series of important findings for our study. First, it appears that the nature of the task matters for the preference to involve an algorithm (Hertz and Wiese, 2019; Lee, 2018; Waytz and Norton, 2014). People seem to be willing to outsource more analytical tasks to an automated agent, but are reluctant to do so with social tasks (Waytz and Norton, 2014). If algorithms are employed in "human tasks", their perceived lack of intuition and subjective judgment capabilities leads to them being judged as less fair and trustworthy (Lee, 2018) or as reductionist (Newman et al., 2020).
Moral decisions received particular attention from Gogoll and Uhl (2018) and Bigman and Gray (2018). Gogoll and Uhl (2018) found that in moral decisions, those affecting third parties, people not only preferred the human decision maker, but even punished others who chose the algorithm. The authors attribute this to a general aversion to automated decisions in the moral domain. This is corroborated by Bigman and Gray (2018), who find that this aversion holds irrespective of whether the decisions made are favorable for the affected parties. The per se aversion towards algorithms in moral domains can be explained, for example, by the deeply ingrained belief that "human is better" (Eastwood et al., 2012), and that algorithms are dehumanizing and unethical in nature (Dawes et al., 1989). In a recent study, Hidalgo et al. (2021) present a series of vignette studies documenting that people judge moral actions by machines differently than identical moral actions by humans. The authors highlight the role of intentions in how decisions are perceived.
Yet, despite these negative attitudes towards algorithms in the moral domain, algorithmic decisions also have a "halo" of perceived scientific authority and objectivity (Cowgill, Dell'Acqua, et al., 2020). Suggestions coming from automated expert systems are seen as more objective and rational than identical suggestions from a human advisor (Dijkstra et al., 1998), and people tend to react less emotionally and less negatively to unfair decisions made by automated systems (Sanfey et al., 2003; Shank, 2012). The perceived fairness of automated decisions may additionally be driven by the increased procedural fairness associated with the use of algorithms, as they act following "calculable rules" and decide "without regard for persons" (Weber, 1978, p. 975, on the benefits of bureaucracy).
Our study contributes to this literature by considering whether people prefer a human or an algorithm to make redistributive decisions on their behalf, and how decisions made by different decision makers are perceived. Importantly, our results only indirectly speak to the discussion of whether people are generally averse to algorithms (Dietvorst et al., 2015), appreciate them (Logg et al., 2019) or even over-rely on them (for an overview of the literature see Chugunova and Sele, 2020), as our main focus is not on people who have discretion to use or not use algorithmic aids, but on those who are affected by these decisions.
It is a priori not clear whether the application of algorithms to redistributive decisions would increase or decrease perceived fairness, and which decision maker would be preferred. While humans can be moral agents, they can also apply different fairness frameworks to different outcomes for their own benefit. Equipped with different moral frameworks, people can always argue that the decision that benefits themselves (or their group) has the moral high ground (Batson and Thompson, 2001; Epley and Dunning, 2000; Monin and Merritt, 2012). That is, humans can arguably better apply the ambiguous rules of moral decision making, but algorithms can coherently and selflessly stick to a programmed set of rules and thus score higher on procedural justice, which may affect how people perceive the decisions (Hechter, 2013). In short, AI cannot change its decision at will and therefore its decisions are always (in this sense) unbiased, while humans might discriminate in somebody's favor. If people are aware of this, they might prefer an algorithm as the decision maker, even in the context of a moral, redistributive decision.
As empirical investigations of these questions require data which cannot readily be found in administrative or company records, we implement an online experiment where a decision maker can redistribute earnings from three real-effort tasks between two players. The immediate analogy would be individual team members who all provided effort for a project or a solution. Importantly, our setting allows for team members to bring different and often difficult-to-compare inputs to the team performance: e.g., coming up with an idea, putting long hours into implementation, or securing needed material resources. At the end of the process a team manager decides who gets which share of the bonus. In our experiment, participants can choose whether the decision maker is an algorithm or a human and subsequently express their perceptions of how fair the decision is and how satisfied they are with the outcome. We choose three specific tasks that allow a range of "fair" distributions, depending on the fairness principle applied. This reflects the ambiguity of the right decision in everyday working environments, and allows for a range of differing views on any decision taken. Additionally, depending on the treatment, we provide information on group affiliation, thus varying the potential bias of the decision maker.
We find an overwhelming preference for an algorithmic decision maker. Regardless of the potential bias of the human decision maker, more than 63% of participants prefer the algorithmic decision maker across treatments. Participants are less likely to choose the AI if they have earned more than their counterpart from the effort or talent tasks, but the preference for the decision maker does not seem to be driven by expected performance differences: the choice of the decision maker is not determined by the participant's own fairness ideals. Even though the majority of participants choose the AI decision maker, the analysis of fairness perceptions reveals that players are (slightly) more satisfied if decisions are made by a human. This result is independent of the actual redistribution decision imposed by the decision maker. As expected, losing tokens severely impacts satisfaction with the decision and its perceived fairness. The strongest reaction, however, comes from a decision that is perceived as unfair because it does not follow a consistent fairness principle (e.g., egalitarian, meritocratic, etc.). This can in part explain the lower satisfaction with the AI decisions, as these, due to the technical simplicity of the employed algorithm, are more likely to be inconsistent with any one principle.
We conclude that, in contrast to some previous findings in the literature, people do not dislike algorithms in moral tasks per se. They actually prefer the AI decision maker, which is not due to a fear of discrimination but appears to be a general preference. The AI's actual decisions, however, are rated as inferior. To "live up to the expectations" and increase the acceptance of these AI decisions, the algorithm has to consistently and coherently apply fairness principles.

Theory and Hypotheses
Consider a situation where two people have each individually generated an income, which is then pooled, and a third party will decide how this pooled endowment is distributed.The primary question we ask is whether people will prefer a human or an algorithm to make this decision when it affects them.
We have discussed before that the literature finds opposing results on whether an algorithm or a human is the preferred decision maker in general. The overarching theme, however, appears to be a general "algorithm aversion" (see, e.g., the reviews by Burton et al., 2020; Chugunova and Sele, 2020), which means that a human decision maker should generally be preferred. Both of these papers also stress the importance of the decision context. Our decision context is inherently a moral one, where decisions will mostly be driven by fairness principles and beliefs. According to the literature, people should have a particularly strong aversion to algorithmic decision makers in this context (Bigman and Gray, 2018; Gogoll and Uhl, 2018). One reason why an algorithm might be preferred in such a situation, however, is if the impartiality of a human decision maker is in question. If a human is perceived to be biased, or if the situation could lead to a potentially biased human decision, the objectivity of an algorithm might be preferred (Cowgill, Dell'Acqua, et al., 2020). We test this assumption by systematically varying between treatments whether there is room for potential discrimination, which allows us to observe whether the mere possibility of discrimination affects preferences for the type of decision maker. Simply put, the human has the potential for discrimination; the algorithm does not.2

H1: If there is no scope for bias (i.e., a decision under a veil of ignorance), a human decision maker will be preferred over an automated one.
H2: If there is scope for bias but the direction of bias is unpredictable, an automated decision maker will be preferred.
A decision bias can take two main forms, positive and negative. While a negative bias is what is colloquially referred to as discrimination, whose negative consequences people wish to avoid, a positive bias would be preferential treatment, have positive (in the context of our study, monetary) effects, and might be sought after. In the following we refer to these biases as negative and positive discrimination respectively. In the restricted context of our controlled experiment, we define negative discrimination as a reduction of earnings due to the revealed features of the affected person, specifically the choice of a painting (see the next section for details). Positive discrimination is similarly an increase in earnings due to the revealed features.3 If we only consider the potential monetary benefits of positive discrimination, we would expect that this will lead to a preference for the human decision maker over the unbiased algorithm:

H3a: Expected positive discrimination will increase the choice of a human decision maker as compared to a situation without discrimination.

This view, however, neglects any form of social preferences and implies that people care solely about their own outcomes. All outcome-based fairness preference models (e.g., Bolton and Ockenfels, 2000; Fehr and Schmidt, 1999) describe a trade-off between preferences for the individual payoff and the fairness of the distribution among all parties. If we assume that the individual payoff preferences outweigh the fairness preferences, hypothesis 3a still holds. If a person has stronger fairness preferences than monetary preferences, we should find that they would prefer the fair outcome over potential positive discrimination, and hence would prefer the algorithm, provided they believe the algorithm is fairer than the human.
H3b: Expected positive discrimination will decrease the choice of human decision makers as compared to a situation without discrimination.
Expected negative discrimination should have a straightforward impact on the preference for the algorithm over a human decision maker. Firstly, the expected payoffs are decreased when there is a risk of discrimination. Secondly, if the distribution of payoffs is already equal, or the person in question starts with lower earnings than the other participant, they cannot expect either an increase in payoff or a fairness improvement from a human decision maker whom they expect to be negatively biased towards them, but they might expect this from the algorithm. If the person has higher earnings before the redistribution, they would still prefer the AI to redistribute, expecting a fair(er) end result.
H3c: Expected discrimination will decrease the choice of human decision makers as compared to no discrimination.
As an additional test for the validity of these effects (3a, 3b, and 3c), we expect to observe no significant change between a situation where no discrimination is possible and a situation where discrimination is possible but not applicable in the particular case. An example might clarify this: Assume there are two groups of people, A and B, and the group affiliation is the only identifying information known. If the decision maker is a member of group A and the two people whose money this person is distributing are from different groups, one from A and one from B, the person from group A might expect positive discrimination, and the person from group B negative discrimination. If, however, all three are from the same group, or the decision maker is from group A but the others are both from group B, no discrimination is possible. If no discrimination is expected in such a way, the distribution of choices should be the same as in a situation where there are no groups or these are not known.
Moving on to our expectations regarding how satisfied people will be with the decision ex post, we start with the obvious: money will, ceteris paribus, make people happy.
H4: The satisfaction with a decision increases with the allocated payoffs, irrespective of the decision maker.
When considering how people judge the decisions of algorithms as compared to those of humans, the literature is inconclusive. On the one hand, Sanfey et al. (2003) find that people react less negatively if algorithms make "unfair" decisions (a similar result can be found in Leyer and Schneider, 2019). Moreover, algorithms appear to enjoy the perceived "halo" of scientific authority and objectivity (Cowgill, Dell'Acqua, et al., 2020) and their decisions may be regarded as fairer (Bai et al., 2020). On the other hand, the literature documents strong aversion to the use of algorithms in the moral domain (Bigman and Gray, 2018; Gogoll and Uhl, 2018), suggesting that perceptions of algorithms may be context dependent. Newman et al. (2020) find that AI decisions on promotion and performance evaluations were considered reductionist, and Longoni et al. (2019) suggest that algorithms are viewed as unable to take into account unique features of individuals in medical recommendation settings. Considering the question of how humans judge machines in a series of ethically relevant situations, Hidalgo et al. (2021) also find in a series of vignette studies that identical actions by humans and machines are judged differently. Yet, while the literature does not agree on the direction of the effect, it agrees that the nature of the decision maker matters for how decisions are perceived and acted upon, so our hypothesis is non-directional.
H5: The nature of the decision maker affects the perceptions of fairness and satisfaction with the decision.
Whether a decision is unfair and discriminatory might be very subjective. Individual fairness principles may even shift from before to after income is earned (Luhan et al., 2019). We do expect, however, that the possibility of a biased decision will, on average, reduce satisfaction with the decision. This relies on the concept of procedural justice: if the process is fair, then any outcome that results from the fair process can be considered fair (Rawls, 1971).
H6: Irrespective of the actual decision made, the possibility of discrimination reduces satisfaction.
Finally, Mellizo et al. (2014) and Sausgruber et al. (2021) find a so-called endogeneity premium in different domains, which states that if certain policies or institutions are chosen rather than exogenously imposed, people appear to like them more. In line with this literature, we expect that having the option to make a choice will overall increase satisfaction with a decision. Interestingly, recent findings by Gallier (2020) suggest that even if one's preference is overruled in a vote, compliance with the new rules is higher if they were endogenously chosen.
H7: Irrespective of the actual decision made and the nature of the decision maker, having a choice of the decision maker increases the satisfaction with the decision.

Design and Procedure
The main aim of the design was to create a situation where we can observe participants' preference for either a human or an algorithmic decision maker to redistribute income that they had previously generated. We incorporated the possibility of discrimination to consider whether this would increase the preference for the algorithm as an unbiased decision maker. In addition, we wanted to measure, ceteris paribus, the satisfaction with and perceived fairness of the decision, depending on the decision maker, the perceived discrimination, and whether the participant had chosen the decision maker in charge.

Income Generation
To start, experimental participants individually earned their initial income by completing three tasks. In each task participants earned tokens of different colors. The three tasks mimic three potential determinants of income central to major fairness theories: luck, effort and talent (Konow, 2003).
In the luck task, participants could earn 100 green tokens if a coin virtually tossed by the computer showed heads. In the effort task, participants were given 15 seconds to count the zeros in two matrices of zeros and ones, for 100 yellow tokens each. In the talent task, participants earned 100 blue tokens for correctly solving a matrix from the Raven fluid intelligence test. In the description of the effort and talent tasks, participants were told that attention to detail and innate abilities, respectively, are of major importance for performing well.4 Participants knew that the tokens would be exchanged for cash (Euro) at the end of the experiment and that each color could vary in exchange rate from 1 to 6 cents per token. This design feature offers two benefits. First, the separately colored tokens allow us to clearly distinguish the fairness principle behind any distributive decision. Second, the fact that the monetary value of the tokens was not known ex ante and could vary forces all participants to see all colors as equally important and not to focus on single income elements or just the total number of tokens.
The distinction of earnings from effort, talent, and luck allows us to differentiate between four distinct fairness principles and related distributions of earnings (see, e.g., Konow, 2003; Luhan et al., 2019): egalitarian, choice egalitarian, meritocratic, and libertarian. While these fairness principles are not the focus of our study, the existence of an array of potentially fair behaviors and redistributions enables decision makers to discriminate against one participant while still making a fair decision. This should make it more apparent that discrimination could potentially happen, as decision makers could hide behind fairness principles. It also allows us to observe whether the discrepancy between own fairness ideals and those of the decision maker can affect satisfaction with the decision.
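To make the four principles concrete, the sketch below maps each one to a rule over the three token colors. The text does not spell out this mapping, so the per-principle treatment of luck, effort, and talent income shown here is our assumed reading of the cited literature, and all function names are illustrative:

```python
# Illustrative sketch of four fairness principles over colored tokens.
# Each task's earnings are a (player_a, player_b) pair. The mapping of
# principles to token types is an assumption, not taken from the paper.
def fair_distributions(luck, effort, talent):
    def split(pair):  # divide a token color equally
        total = sum(pair)
        return (total / 2, total / 2)

    def keep(pair):   # leave a token color as earned
        return pair

    return {
        # Egalitarian: split all income equally, regardless of source.
        "egalitarian": [split(luck), split(effort), split(talent)],
        # Choice egalitarian (assumed): equalize income not under the
        # player's control (luck, talent), keep effort income.
        "choice_egalitarian": [split(luck), keep(effort), split(talent)],
        # Meritocratic (assumed): equalize luck only; effort and talent
        # income count as deserved.
        "meritocratic": [split(luck), keep(effort), keep(talent)],
        # Libertarian: everyone keeps exactly what they earned.
        "libertarian": [keep(luck), keep(effort), keep(talent)],
    }
```

For instance, with earnings `luck=(100, 0)`, `effort=(200, 0)`, `talent=(0, 100)`, the libertarian rule leaves everything unchanged while the egalitarian rule halves every color.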

Choice of a Decision Maker
In order to test our H1 on the general preference for a human decision maker or an algorithm to redistribute the earnings, we paired two participants and informed them of their own and the other person's token earnings from all three tasks. Both participants could individually pick a human or an algorithmic decision maker. In case of a unanimous choice of one decision maker, it was implemented; in case of disagreement, the choice of one participant was implemented with equal probability.
The human decision maker (DM) was an anonymous and uninvolved third party. Participants were told that this person received the same explanation about the tasks that generated the incomes as they did. Decision makers received no information about the two participants other than their income portfolios, nor were they given any instructions on how to decide other than to "make a fair decision". The actions of the decision makers were not incentivized: they received a flat payment regardless of their choices.
As for the description of the algorithm, we deliberately did not reveal detailed information about its mechanics, to keep the information status close to the real world, where people are generally aware that, for example, their satnav calculates routes, but are not able or interested in fully understanding the mechanics of the algorithmic process. We therefore, truthfully, informed participants that the algorithm would choose a "fair distribution based on data from a survey of several hundred participants. The participants of the survey were informed about the three tasks you completed in stage 1 and then determined what a fair distribution is. The algorithm will apply these decision patterns to the group's income and determine a fair distribution". This description clearly states that the data used by the algorithm is not historic, that it was specifically tailored to the tasks the participants faced, and that the decision involved some transformation of the data.
To implement our decision algorithm, we conducted an online survey via Prolific.co with 506 participants (253 male and 253 female) from the UK and Germany, all fluent in English. The survey participants were asked to determine a fair redistribution of tokens for hypothetical pairs of players. They saw the same tasks as in the subsequent experiment, with identical explanations. The series of questions covered all initial token distributions that could occur in the experiment, with either one person earning more, or both starting with equal amounts, for each task type. Based on the answers we programmed an automated decision maker. It considered whether the tokens to be redistributed stem from effort, luck or talent and whether participants have an equal or unequal number of tokens, and then determined the redistribution using the answers of the survey participants as probability weights. For instance, if in the effort task one participant in the pair had 100 tokens and the other 0, the algorithmic decision maker would, with 76.48% probability, not redistribute the points, with 21.94% probability split them equally among the two participants, and with 0.79% probability redistribute all the points in favor of the participant who had zero points.
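This probability-weighted rule can be sketched in a few lines. The weights below are the effort-task example from the text; the experiment used separate survey-derived weights for each task type and initial split, and assigning the residual probability mass to the higher earner is our assumption, as are all names in the code:

```python
import random

# Minimal sketch of the survey-based redistribution rule (one token color).
# Weights: effort-task example, 100 vs. 0 tokens. 'all_to_high' carries the
# residual probability mass not stated in the text -- an assumption.
EXAMPLE_WEIGHTS = {
    "keep": 0.7648,        # leave the earnings unchanged
    "split": 0.2194,       # divide the tokens equally
    "all_to_low": 0.0079,  # give everything to the player who had fewer tokens
    "all_to_high": 0.0079, # residual mass (assumed)
}

def redistribute(high, low, weights=EXAMPLE_WEIGHTS, rng=random):
    """Draw one redistribution outcome using survey answers as weights."""
    outcome = rng.choices(list(weights), weights=list(weights.values()))[0]
    total = high + low
    if outcome == "keep":
        return high, low
    if outcome == "split":
        return total // 2, total - total // 2
    if outcome == "all_to_low":
        return 0, total
    return total, 0  # all_to_high
```

A call such as `redistribute(100, 0)` then returns one of `(100, 0)`, `(50, 50)`, or `(0, 100)`, with the stated probabilities.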
To simplify the design and its further interpretation, we did not allow for continuous redistribution for either type of decision maker: e.g., the decision makers could not transfer 1 token out of 100 to the other player. The decision maker could redistribute the tokens of a certain color evenly, give them all to one of the players, or leave them unchanged.5

The experimental situation thus created a choice between an algorithm that was fair, based on the fairness principles held by several hundred people, and a human decision maker who was asked to make a fair decision. We discussed above that, based on the literature, human decision makers are generally preferred in situations concerning moral questions. However, if the decision maker could be biased, the preference might switch to the unbiased algorithm.

Negative and Positive Discrimination
To test the role of bias as formulated in our hypotheses H2, H3a, H3b and H3c, we introduced a source of potential discrimination for the human decision maker. Our aim was to keep this source of discrimination free from the possible confounding effects of real-world biases and to use a purely lab-induced feature. We implemented a well-established procedure to create minimal groups, as introduced by Tajfel (1970). At the beginning of the experiment, all participants (including the human decision makers) were shown two paintings, one by Paul Klee and one by Wassily Kandinsky, and were asked to select which one they preferred. This simple choice, if revealed to others, has been shown to induce perceptions of an in-group and an out-group amongst participants, which might not be very strong, but in the absence of any other information can lead to discriminatory behavior (ibid.). The decision maker might favor his or her in-group due to, for instance, homophily (Y. Chen and S. X. Li, 2009; McPherson et al., 2001), and providing this information allows for favoritism of this type, allowing for positive discrimination in the redistribution of income. By design, we do not incentivize any sort of discrimination, as the payment of the decision maker is independent of the decision, and we therefore test the lower bound of the effect. Even if the decision maker does not actually favor members of the in-group, the introduction of the group information allows for discrimination and may therefore affect the choice of the decision maker.
Experimental Treatments

Our first experimental treatment varies whether the group information is revealed. In all treatments, all participants choose a painting. In the first treatment (Tr1), no further mention of this was made in the experiment and the choice was not revealed to anybody. In Tr2, the information about the painting choice was revealed within the matching group: participants in the pair knew each other's painting choice and that of the (potential) decision maker, and knew that the decision maker would have the same information.

Notes to Table 1: Info indicates whether the painting choice of all parties was revealed; Nature of DM indicates whether the decisions were taken by a human decision maker, an algorithm, or whether there was a choice between the two; the last three columns contain the number of sessions per treatment, the number of regular participants who earn tokens and choose a decision maker, and the number of decision makers in each session.
In addition to the question of which decision maker is preferred, and whether potential discrimination alters this preference, we also study whether the participants are satisfied with the redistributive decision, and how this is influenced by their change of earnings (H4), the nature of the decision maker (H5), the possibility of discrimination (H6), and having a choice of decision maker (H7).
We therefore introduced three more treatments to test these hypotheses and to control for possible interaction effects: in treatment Tr3 the decision maker was always an algorithm, and in Tr4 always a human. To consider the interaction effect of the endogenous choice of the decision maker and the presence or absence of the group information, we additionally vary whether the information is revealed in the treatments with exogenously imposed decision makers.6 In all treatments, players could indicate on two seven-point Likert scales how happy they were with the redistribution and whether they considered it fair. When answering, they saw the distribution of tokens after the redistribution, the initial distribution of tokens within the pair, and the nature of the decision maker.
Table 1 summarizes our five treatments, which vary three parameters: (1) whether participants could choose the nature of the decision maker, (2) whether the group information is revealed, and (3) the nature of the decision maker.
Timeline of the Experiment

Fig. 1 provides an overview of the timeline in all treatments. First, all participants chose their preferred painting. Following this, participants were randomly assigned to be regular participants or decision makers.7 The regular participants received instructions for the tasks in the income generation stage and performed them (a coin toss, matrices with zeros, and a Raven matrix). The decision makers received the same instructions with an explanation that only the regular participants would perform the tasks. In the following redistribution stage, players were matched into pairs and the treatment variation was introduced, either giving players a choice of the decision maker (Tr1 and Tr2) or informing them about the nature of the decision maker (Tr3, Tr4, Tr5). In the Info treatments (Tr2 and Tr5), in addition to the information on tokens earned by each player in the pair, the information on the painting choice was revealed. The treatments were implemented between-subjects and each participant faced one treatment only. The redistribution stage consisted of six periods, effectively six repetitions with different matching groups. In all treatments, participants were shown their own and the matched player's token portfolios and were informed that the tokens would be redistributed within the pair. They were asked to make a hypothetical decision on what distribution they would consider fair for their pair. The decision makers learned the token portfolio of the pair and could decide separately for each color of token whether it should be redistributed. Neither the decision makers nor the players were aware of the value of each token at this point. After the redistribution stage, regular players were shown one by one all six pairings in the individual periods and learned what redistribution decision was made for each period/pairing. Participants were informed (Tr1 and Tr2) or reminded (Tr3, Tr4, and Tr5) of the nature of the decision maker, the painting choices of all involved parties (Tr2 and Tr5), and the outcome of the distribution. Participants were asked to indicate on separate seven-point Likert scales how happy they were with the redistribution decision and whether they considered it to be fair. A random draw determined the payoff-relevant period and the Euro value of each color of token was revealed. Based on this information, participants were informed about how much they earned in the experiment. After learning the exchange rate for each color of token, players were asked again how happy they were with the decision in the payment-relevant round and how fair it was.
After the experiment was completed, players filled out a questionnaire including basic demographic characteristics and self-evaluations of trust, risk, and political attitudes. Additionally, we included a shortened version of the readiness-for-technology scale (Neyer et al., 2012) and the social justice orientation scale (Hülle et al., 2018), and asked several questions on attitudes towards technology. These measures were selected as they might capture important sources of heterogeneity in evaluating the decisions, the perceived fairness, and the preference for human or AI decision makers (see, e.g., Parker and Grote, 2020).
Participants were recruited from the subject pool of the University of Hamburg using hroot (Bock et al., 2014). None of the participants took part in the experiment more than once. We conducted 17 online sessions of around 22 participants each. In total, 362 participants took part in the experiment. The sessions were gender-balanced and the average age of participants was 25. The average payment was 9.10 Euro for 45 minutes.
Due to our focus on the preference for and choice of the type of decision maker, and on how possible discrimination affects this choice, we conducted an unequal number of sessions per treatment (see Table 1). In each treatment, apart from Tr3 where the decision was taken only by the algorithm, we randomly allocated two human decision makers per session, each deciding for several pairs of regular participants. The decision makers received a flat payment of 10 Euro regardless of their decisions.

Results
We follow the order of our hypotheses and start with the question of which decision maker was preferred, i.e., chosen most frequently, before analyzing what determines whether decisions were perceived as fair and how happy participants were with the decisions taken.

Choice of the Decision Maker
Table 2 contains the absolute and relative frequencies of decision maker choices from Tr1 and Tr2, along with the p-values from non-parametric inference tests. We find an overall preference for the AI decision maker. In the absence of information on group membership (the chosen painting), the algorithm is preferred in 63.25% of all choices in Tr1. We reject our first hypothesis that under the veil of ignorance the human decision maker is preferred. We find quite the contrary: the AI is chosen significantly more frequently than 50% (two-sided binomial test, p < 0.001).
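The binomial test used here can be sketched as follows. The raw choice counts are not restated in this section, so the numbers below are illustrative values chosen only to match the reported 63.25% share (74 AI choices out of 117):

```python
from scipy.stats import binomtest

# Illustrative counts: 74 of 117 choices for the AI reproduces the
# reported 63.25% share; the paper's actual counts may differ.
ai_choices, total = 74, 117
result = binomtest(ai_choices, n=total, p=0.5, alternative="two-sided")
print(result.statistic)  # observed proportion, ~0.6325
print(result.pvalue)     # small, rejecting an even 50/50 split
```

With the paper's full sample the p-value falls below 0.001; with these smaller illustrative counts it is merely well below conventional significance levels.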

Result 1. The AI decision maker is preferred over the human decision maker in the absence of potential discrimination.
Revealing the information on the painting choice of all parties, and thereby introducing the potential for discrimination, does not change this preference: we find an almost identical 63.89% majority of choices for the AI in Tr2. This does not allow for a clear test of our second hypothesis that the general prospect of a biased decision by the human decision maker leads to a preference for the AI. The majority of participants choose the AI, but with no significant difference in the preferred decision maker between Tr1 without and Tr2 with the possibility for discrimination (χ2 test, p = 0.824), it appears that the general preference for the AI decision maker simply prevails in Tr2 (two-sided binomial test, p < 0.001). It is not the prospect of discrimination that drives this overall preference for the AI decision maker, and we reject H2.
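The treatment comparison is a standard test of independence on a 2×2 table. The counts below are hypothetical, constructed only to reproduce the reported shares of 63.25% (Tr1) and 63.89% (Tr2):

```python
from scipy.stats import chi2_contingency

# Hypothetical 2x2 table (rows: Tr1, Tr2; columns: AI choices, human choices),
# matching the reported shares but not the paper's actual counts.
table = [[74, 43],
         [69, 39]]
chi2, p, dof, expected = chi2_contingency(table)
print(dof)  # 1 degree of freedom for a 2x2 table
print(p)    # large p-value: no significant difference between treatments
```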
As in our setting potential positive discrimination for one member of the pair means potential negative discrimination for the other, the aggregate finding of no effect of potential discrimination could be due to choices under positive discrimination (H3a, H3b) being balanced out by choices under negative discrimination (H3c).
To consider this, we split the sample into the three classes of potential discrimination (positive, negative, and no discrimination) and analyze the effects of each type of discrimination separately. However, within discrimination types, we again do not find any impact of potential discrimination on the choice of the decision maker. In all three cases, we observe a strong preference for the AI as a decision maker, and no significant difference to any of the other discrimination types in the information treatment (χ2 test, p = 0.954) or to the treatment without information (see column χ2 Tr1 in Table 2). Irrespective of potential positive or negative discrimination, the majority of choices are for the AI decision maker, and we reject our hypotheses H3a, H3b, and H3c. As a final non-parametric test, we implement a trend test, but do not find a significant trend in our observations when ranked by order of potential discrimination (two-sided Jonckheere-Terpstra test, p = 0.7784).
Result 2. Neither the potential for discrimination nor the direction of discrimination affects the preference for the algorithm.

As a next step, we use a multivariate analysis to look for the determinants of the choice of decision maker. Table 3 contains the results from pooled probit regressions with robust standard errors clustered on the individual level. The dependent variable is the probability of choosing a human decision maker. We link this choice to all information available at the time when participants make the choice of the decision maker: the tokens that both participants earned from the three tasks; whether the information on the painting choices was available and the resulting possibility of discrimination; implications from the various fairness principles; and a small range of background variables from the questionnaire. We find no significant impact of the availability of the group information on the choice of the decision maker (variable Info). Thus, based on both the non-parametric and the regression analysis, we can reject H2. We tested several specifications of perceived discrimination; none had a significant impact on the choice of the decision maker. In column 1, we report the result from a specification that uses three categorical dummies for types of discrimination: no discrimination (either no info provided or all participants had chosen the same picture), which serves as the base category, the possibility of positive discrimination, and the possibility of negative discrimination. We also consider whether people may be more or less likely to prefer human or AI decision makers depending on the differences in earnings (i.e., token portfolios) between themselves and the paired player. In three regression specifications we use different approaches to capture differences in token earnings between the players. In column 1, we include the token earnings from all three tasks by the participant and their partner as individual variables. We find that only the
earnings from the tasks requiring effort and talent have a significant impact: earning more in these tasks increases the likelihood of choosing a human decision maker. Looking at the partner's tokens, we find a significant but much smaller effect of the income from the effort task. In column 2, we use an alternative specification, calculating the absolute distance between the two participants' earnings. As none of these distances has a significant effect on the probability of choosing a human decision maker, this does not seem to reflect how participants considered the earnings when choosing the decision maker. In column 3, we replace the distance between the tokens with a simpler approach: we include a binary variable that captures whether the focal participant had more tokens of each kind than the partner. We again find a significant positive impact of the earnings from the effort and talent tasks, but not the luck task, on the choice of the decision maker.8 If participants had earned more in these tasks than their partners, they were more likely to choose a human decision maker.
Result 3. Having earned more in the effort or talent tasks increases the likelihood of choosing a human decision maker. Earning more than the other participant in these tasks has an even stronger impact on the choice of a human decision maker.

This result could be due to the view that a human will have a higher appreciation of what was required to earn these tokens and therefore will not redistribute them. Generally speaking, this would mean that participants believe that a human decision maker would hold a fairness principle that is more favorable to their (higher) earnings. We consider this in the last two columns of Table 3. In column 4, we determined whether the participant would lose tokens (i.e., these would be redistributed to the other participant) if the decision maker held one of four fairness ideals (see Section 3). We find no impact of this prospect of losing tokens under one of the fairness principles. However, this specification assumes that the participants are aware of these principles and mentally process the displayed earning tables in a very sophisticated way. To relax this assumption, in column 5 we simplify this approach by creating a variable that counts under how many of the fairness ideals the participant would lose tokens to the partner. This variable ranges from 0 to 3 and is a simple representation of how likely it is that a fair decision maker will redistribute money away from the participant.9 We find that even this simple specification does not yield a significant impact on the choice of the decision maker, and we conclude that participants consider fairness principles only in a very limited way.
Result 4. Possible fairness ideals of the decision maker and the resulting redistribution of tokens had no apparent impact on the choice of the decision maker.
In addition to these factors contributing to the choice of the decision maker, we control for the participants' age, sex, whether they are classified as technology-ready, whether they are trusting, and two opinion questions on fair and unbiased decision making from our post-experimental questionnaire.10 Of these six, only the two opinion questions had a significant impact on the choice of the decision maker. Fair-Just asked for a rating of who is better at making fair and just decisions, the AI or humans. As expected, the higher participants rated this ability for humans, the more likely they were to opt for a human decision maker. Unbiased recorded whether people believed that it was hard for humans to make unbiased decisions. Unsurprisingly, the more participants believed that this would be easy for humans, the more they picked the human decision maker.11

Satisfaction with the Decision and Perceived Fairness
To consider whether people may react differently to a decision if it comes from a human as compared to an algorithm, we need to disentangle potential differences in the decisions of the different types of decision makers from the effect of the nature of the decision maker. Unsurprisingly, human decision makers and the algorithm make different decisions (see Appendix A.1 for more details). While this difference in performance cannot affect the choice of the decision makers in Tr1 and Tr2 ex ante, it is likely to affect perceptions of fairness and satisfaction. People reported their satisfaction and their rating of how fair a particular redistribution decision was on two separate seven-point Likert scales.
To consider what factors affect the fairness and satisfaction rankings of the participants, we run several specifications of a pooled OLS regression with standard errors clustered at the individual level. Pooled OLS allows us to utilize the data from all treatments and to see whether time-invariant characteristics such as age or gender of the participant, or treatment features (availability of choice or of the information on the picture choice), affect the fairness and satisfaction ratings. The results of the pooled OLS are reported in Table 4.
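The pooled OLS with individual-level clustering can be sketched as follows. The data and effect sizes are simulated assumptions, loosely inspired by the direction of the reported estimates rather than taken from the paper:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated stand-in for the ratings data: 120 participants x 6 periods.
rng = np.random.default_rng(1)
n_subj, n_per = 120, 6
n = n_subj * n_per
df = pd.DataFrame({
    "subject": np.repeat(np.arange(n_subj), n_per),
    "dm_human": rng.integers(0, 2, n),     # human decision maker dummy
    "unfair": rng.binomial(1, 0.1, n),     # inconsistent redistribution dummy
    "lost_tokens": rng.integers(0, 2, n),  # lost tokens overall dummy
})
# Assumed effects: inconsistency and token losses depress satisfaction.
df["satisfaction"] = (
    4.5 + 0.25 * df["dm_human"] - 0.8 * df["unfair"]
    - 0.7 * df["lost_tokens"] + rng.normal(scale=1.2, size=n)
).clip(1, 7)  # keep ratings on the 7-point scale

fit = smf.ols("satisfaction ~ dm_human + unfair + lost_tokens", df).fit(
    cov_type="cluster", cov_kwds={"groups": df["subject"]})
print(fit.params)
```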
In both regressions we controlled for several parameters of the decision situation. For each type of token, we consider whether it increased or decreased after the redistribution as compared to the initial earnings (variable Before-After), whether the overlap between own fairness ideals and those of the decision maker matters (variable Hyp-Actual) and, in line with several fairness theories (see, e.g., Bolton and Ockenfels, 2000; Fehr and Schmidt, 1999), the difference in tokens between the players after the redistribution (Own-Partner (After)). Additionally, we introduce dummy variables that capture whether the decision maker was human, whether the player lost tokens in total, and what type of discrimination (positive, negative, or none) the player is facing in the round. Importantly, we add a dummy variable for unfair redistribution (DV:Unfair). This dummy captures whether the implemented redistribution does not correspond to any of the major fairness principles and therefore may be regarded as inconsistent. When submitting their rankings, participants could leave additional comments in an open text field. From these comments, we can see that the participants are well aware of different fairness ideals and are ready to tolerate them even if they do not coincide with their own view. For instance, participants had no objections if no redistribution at all or an egalitarian redistribution (i.e., all points split equally) was implemented, even if they themselves would have redistributed points differently. Yet, it appears that players were unhappy in cases where the decision mixed several justice principles: if, for example, points earned by talent or effort were redistributed but not those earned by luck, or if effort points were redistributed but not the ones earned by talent.
From the comments, it appears that as long as one fairness principle was followed, even if participants held a differing view or subscribed to a different principle, this did not lead to unhappiness or perceived unfairness of the decision.12

Footnote 11: Although these variables seemingly capture very related concepts, the correlation between them is rather low and insignificant.

Footnote 12: Two representative comments: "Both players get the same amount. Even if it is not a complete coincidence how the points were distributed initially, I find it fair. But I would also have understood if the 'better' player had gotten more." "There was no redistribution of the points determined randomly by coin toss. The points based on skill and concentration, on the other hand, were partially awarded to me without compensation. Even though I benefit from this, I do not feel this distribution is fair to Player B." As not all participants left comments, they are not suitable for any systematic text analysis.

As the probabilities for the decisions of the algorithm were drawn independently per task category, it mechanically followed that the algorithm ended up being less consistent with the applied principles: 12% of AI decisions were inconsistent, i.e., did not follow one principle, as compared to only 5% of human decisions (t-test, p < 0.001). In total, 9.5% of all redistribution decisions were classified as inconsistent (DV:Unfair equal to 1).
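Why independent per-category draws mechanically produce principle-inconsistent decisions can be illustrated with a small simulation. The set of "consistent" patterns below (redistribute nothing, everything, only luck tokens, or everything except luck) is a stylized assumption for illustration, not the paper's exact classification, and the 50% draw probability is likewise assumed:

```python
import random

# Each decision is a triple of independent yes/no redistribution draws,
# one per task category: (luck, effort, talent).
CONSISTENT = {(0, 0, 0), (1, 1, 1), (1, 0, 0), (0, 1, 1)}

random.seed(42)
n = 100_000
inconsistent = sum(
    tuple(int(random.random() < 0.5) for _ in range(3)) not in CONSISTENT
    for _ in range(n)
)
print(inconsistent / n)  # ~0.5: four of the eight equally likely patterns
```

With equal draw probabilities, half of all patterns mix principles; the actual algorithm's inconsistency rate (12%) is lower because its draw probabilities were not uniform, but the mechanism is the same.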
In the pooled OLS specification, we additionally control for age, gender, level of trust, and whether the treatment featured the choice option or the group information. Tr1 and Tr2 allow for a fixed-effects OLS specification to consider within-subject variation for participants exposed to both types of decision makers. We first report the results of the pooled OLS and then discuss additional insights that stem from the fixed-effects OLS estimation. The results, discussed below, are consistent regardless of the approach.
We consider fairness and satisfaction rankings separately. While the two are highly correlated (0.77, p < 0.001), they are not identical, which explains why the regression results vary slightly. Many participants differentiated between the two concepts, noting, for instance, that they received more money and were therefore more satisfied, although they found the decision unfair.13
We start by discussing the results of the pooled OLS specification (Table 4). By far the largest coefficient in magnitude is that of DV:Unfair, which captures whether the redistribution decision inconsistently combines several fairness principles. When the redistribution is inconsistent, satisfaction with the decision is reduced by 0.76 points and perceived fairness by 1.38 points, which, on a 7-point scale, corresponds to approximately a 10% and a 20% decrease, respectively. We additionally consider whether reactions to inconsistent decisions depend on the nature of the decision maker, yet the respective interaction term is not significant, as reported in Table 7. Comparing the determinants of the fairness and satisfaction ratings, we observe some "flexibility" in the notion of fairness in our participant sample, again in line with the comments from the open comment fields. We see that a deviation from the participant's own fairness ideals did not have a negative impact on their fairness rating, but slightly affected satisfaction with the decision for the luck and talent tasks (Hyp-Actual).
Result 5. Participants accept fairness principles differing from their own as fair. The main determinant of satisfaction and perceived fairness overall is whether fairness principles are applied consistently.
The factor with the second largest impact on the perceived satisfaction and fairness of a decision is a loss of tokens (DV:Lost Tokens). This dummy variable takes a value of one if the total number of tokens (regardless of their color) is smaller after the redistribution than before. When we split this up into the changes in token holdings of individual colors, we find that changes in income from effort and talent are seen as more important than those from luck (Before-After), which is in line with the literature (e.g., Luhan et al., 2019). These token changes do not significantly impact the perceived fairness of the decision, however. This might be because the mere change could be an increase as well as a decrease, so a general fairness effect cannot be determined. Ending up with more tokens of any color than the partner also improves satisfaction and perceived fairness, confirming a somewhat self-serving bias in terms of fairness, whereby participants appear to accept having more than their counterpart as fair. We cannot reject our H4, as we find a strong and significant impact of the changes in tokens on the satisfaction with the decision. There does not appear to be a significant interaction effect of the lost tokens with the nature of the decision maker (see specifications 3 and 4).

Result 6. Changes in tokens strongly affect the satisfaction with a decision, but only the general loss of tokens reduces the perceived fairness. The nature of the decision maker does not additionally contribute to the effect.
Comparing the treatments with a choice (Tr1 and Tr2) and with exogenously determined decision makers (Tr3-Tr5), we conclude that being able to choose the type of decision maker does not contribute to satisfaction with the decision or its perceived fairness (variable DV:Choice), which is a direct rejection of H8.

Result 7. We do not find a significant impact of the ability to choose the decision maker on the satisfaction or perceived fairness of the decisions.
Providing the information on the group affiliation (variable DV:Info), and therefore introducing the possibility of discrimination, significantly reduces both the perceived fairness and the satisfaction with the decision by about a third of a point. Importantly, in the case of fairness, it does not matter whether the player expects the group affiliation to work to his or her disadvantage.14 If players believe that they might be discriminated against (as the outsider), this decreases satisfaction with the decision.

Result 8. The general possibility of discrimination significantly reduces the satisfaction with a decision and its perceived fairness. Satisfaction is lowest if negative discrimination is expected. However, the judgment of fairness is independent of whether the expected discrimination is positive or negative.
Additionally, participants with higher levels of general trust are more satisfied with the decisions and perceive them to be more fair, which corroborates that the drop in satisfaction comes from perceived discrimination. Age negatively affects both ratings.
According to our pooled OLS regression, the decision maker being human has a slightly positive effect on the satisfaction rating, but not on the fairness rating. To consider more closely how the nature of the decision maker may affect satisfaction and perceived fairness, we run a fixed-effects panel regression with standard errors clustered at the individual level for treatments Tr1 and Tr2.15 The fixed-effects regression allows us to consider subjects who experienced within-subject variation of the decision maker. The results of the panel regression can be found in Table 5 in the Appendix. We include largely the same controls as in the pooled OLS specification, provided they were time-variant. In line with the findings of Gogoll and Uhl (2018) and Bigman and Gray (2018), we observe that if a moral decision is made by a human decision maker, it is rated as about a quarter of a point more fair and participants report a higher satisfaction (DV:DM Human). We therefore fail to reject our H5 on the decision maker's nature and its impact on satisfaction and perceived fairness.

Result 9. If the redistribution decision is made by a human decision maker, it is perceived as more fair and leads to higher satisfaction.
Additionally, in the fixed-effects regression we can consider how getting one's desired decision maker contributes to the ratings. In our experiment, if the two players in a pair disagreed on the type of decision maker, which happened in 21.6% of cases, one of the choices was randomly implemented. Therefore, we have a subset of players who preferred the algorithm but received a human decision maker, and vice versa. We do not detect any effect of receiving one's preferred decision maker on the fairness and satisfaction ratings (variable DV:Preferred DM in Table 5).
Result 10. Receiving the type of decision maker that one voted for does not affect the fairness and satisfaction ratings of the decisions made by that decision maker.

A Matter of Principle or Matter of Money?
During the experiment, participants earned tokens of different colors. They only knew that the exchange rate of each token into Euro was between 1 and 6 cents per token and that different colors had different exchange rates.16 After the exchange rate was announced, participants saw the payout-relevant round again and were asked once more how satisfied they were with the decision and how fair they found it. This feature of the experiment allows us to consider whether the ratings are driven by meeting fairness principles or by own monetary interests. To do so, we compare fairness and satisfaction ratings for a given decision before (i.e., the ratings submitted for this decision initially) and after the exchange rate was revealed.
Ratings with and without information on the exchange rate are highly correlated (satisfaction: 0.77, p < 0.001; fairness: 0.81, p < 0.001). Participants adjusted their evaluations in both directions. Over all treatments, the fairness score decreased by 0.1 points (SD = 1.2) and the satisfaction score by 0.05 points (SD = 1.32) after the exchange rate was revealed. To consider whether any factors affect the adjustment in a systematic manner, we use an OLS regression with the final adjusted fairness and satisfaction scores as dependent variables and a series of controls (see Table 6). The initial satisfaction or fairness ranking submitted for the particular decision before the exchange rate was known is, unsurprisingly, a strong predictor of the final rating and shows that participants do not revise their ratings randomly. Our specification can explain almost 70% of the variation, yet few controls are significant. Both satisfaction and fairness are affected by monetary outcomes: the more participants earned in monetary terms, the more they increased their satisfaction and fairness rankings. Additionally, satisfaction with the outcome decreases if participants would have earned more without any redistribution, or if they would have earned more under their own hypothetical redistribution decision. Ceteris paribus, women decrease their rankings significantly more than men. We find again that an inconsistent mix of several fairness principles reduces the final satisfaction by an additional 0.44 points. As expected, the nature of the decision maker and the other controls do not affect the adjustment.
Result 11. Learning the monetary values of the tokens significantly impacts the perception of the decisions. Higher monetary values lead to increased satisfaction and higher fairness ratings, even though the relative positions and the quantitative redistributions have not changed.

Discussion and Conclusion
We study whether people prefer a human or an algorithm to decide on redistributing their earnings and analyze the impact of discrimination on this preference. We examine how the nature of the decision maker affects the perceived fairness of the decision and the satisfaction with it. The question is motivated by the increased use of automated decision systems in domains beyond analytical and predictive tasks, and draws attention to the use of algorithmic decisions in a wide range of applications, from policy making to determining job routines and wider management issues. It is important to determine the preferred decision maker and the impact of the decision maker's type on how decisions are perceived, because ultimately this will affect not only the general satisfaction of the people concerned but also their reactions to and compliance with these decisions. If one type of decision maker is perceived to make fairer decisions, for example, this will lead to wider acceptance and implementation of these decisions. In short, who is the preferred decision maker might determine who should make the decision. We find that the algorithmic decision maker is strongly preferred, even in a "moral" situation.
Algorithmic advice and decision support systems have become a common tool in managerial decisions, ranging from small, frequent choices such as allocating daily tasks and shifts to tasks with important consequences such as hiring new employees, distributing bonus payments, and even promotions. Given the number of tasks of this type in the workplace and the fact that technological advances already allow many of them to be automated, it is important to establish whether the people affected by these decisions prefer a human or an algorithmic decision maker, and what influence the type of decision maker has on how the decision is perceived. Apart from the direct effect on employee satisfaction, the perception of decisions might also affect performance (Bai et al., 2020; Strobel, 2019). Indeed, a recent survey of the use of digital tools for HR management documents that while companies are aware of the benefits that digital HRM tools may bring, they are also concerned about how they would be perceived by the employees (Chugunova and Danilov, 2022).
Our study contributes to the large and burgeoning literature on the multitude of questions that arise with the advancement of technology. Does it lead to more equality and equal treatment (Tucker and Yu, 2019), or does it, on the contrary, only deepen existing inequality through the digital divide (Warschauer, 2003)? How should algorithms be designed (D. Li et al., 2020) and what values should be integrated into them (Awad et al., 2018)? Do algorithms make objectively better decisions (Cowgill and Tucker, 2019)?
Our experiment provides two sets of results. First, with over 60% of participants choosing a redistributive algorithm, we find a strong preference for algorithmic decision makers. This stands in stark contrast to previous evidence documenting a particular aversion to algorithms in social and moral tasks (Bigman and Gray, 2018; Gogoll and Uhl, 2018). Our findings suggest that while people may use algorithms too little themselves (Dietvorst et al., 2015), they prefer algorithms when they are affected by the decision. This preference for the algorithmic decision maker persists regardless of the potential discrimination. That is, it is not the perceived unbiasedness of the algorithm that drives the result.
Second, and somewhat in contrast to the first result, people are less satisfied with the decisions taken by the algorithm, and they judge them as less fair than human decisions. These are the same people who wanted the algorithm to make the decision. Our analysis identifies two main drivers of the lower satisfaction and fairness ratings. Most importantly, decisions have to be consistent with a fairness principle. Participants react very negatively to "mistakes" of both human decision makers and algorithms when fairness principles are applied incoherently. We do not observe the difference in reactions to mistakes by humans versus algorithms that was reported in the previous literature as one of the reasons for algorithm aversion (e.g., ibid.). This result leads us to believe that a more sophisticated algorithm that does not allow for inconsistencies and makes fewer "mistakes" could elicit a more positive reaction. A smaller, but nevertheless significant, factor is indeed the nature of the decision maker. Decisions made by a human, irrespective of the content or consequences of the decision, are rated as fairer and lead to higher satisfaction. Based on a recent study by Hidalgo et al. (2021), one might speculate that this is due to the lack of intentions of algorithmic decision makers.
How can we reconcile these seemingly conflicting results, that people opt for the algorithmic decision maker but do not seem to like its decisions? In our view, the lessons for a management context are clear. We do not find any aversion to algorithmic decisions, even in the complex "moral" domain of redistributing earnings; on the contrary, we find a preference for algorithms. Disclosing the use of decision-support algorithms, or even full reliance on AI decisions, should not lead to negative reactions from the affected people. Depending on the situation, highlighting the use of algorithms could increase the perceived fairness of the decisions and thus even improve acceptance and efficiency. Given that we can pinpoint the reason for the observed ex-post dissatisfaction with the algorithmic decisions, our conclusion is that an algorithm that coherently follows fairness principles would be preferred and its decisions would outperform human decisions.

B Experimental Instructions for the treatment Choice Info
The experiment was conducted in German. Original instructions are available upon request. As the experiment was conducted online, the instructions were displayed to participants screen by screen. They could take as much time as needed to read them. Highlights and formatting are preserved. Subsection names indicate whether the instructions were shown to all participants or only to some types.

B.1 Displayed to all participants
Welcome to the experiment! You participate in a paid economic experiment. How much you will earn depends on your decisions and partly on the decisions of other participants. Therefore, it is important that you read the following explanations carefully. Your decisions in the experiment and your answers in the subsequent questionnaire are anonymous. Neither the experiment leaders nor the other participants know which decisions you made and how much you earned.
General rules of conduct.
This experiment lasts about 35 minutes. You will be paid for your participation and we ask for your full attention during the experiment. Please do not listen to music or engage in conversation with others. Please switch your phone to silent or airplane mode. Please do not read or respond to emails, or interact on social media. Your payout will be displayed at the end of the experiment. You must complete the experiment and all questionnaires to receive the payment.

*End of screen*
Which picture do you prefer? The experiment will start shortly. Please choose the picture that you like best. There is no right or wrong answer, just choose according to your taste. The chosen picture will be used to form groups during the experiment.
Please choose the picture that you like best.
*End of the screen*

General procedure
This experiment consists of two stages and a final questionnaire. After completing the experiment, you will be redirected to a separate secure page where you can enter your bank details for the payment transfer.
All participants in this experiment will be randomly divided into two types, type P and type D. What you earn in these tasks will be carried over to stage 2 and determine your total payment.
During the experiment we use "points" instead of Euros. In each task you can earn points of a different color (green, yellow, blue). At the end of the experiment, points of each color will be exchanged for Euros at a different exchange rate. The exchange rate of each point is between 1 and 6 cents. It will be announced at the end of the experiment.
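As an editorial illustration of the payoff rule described above, the conversion from colored points to Euros can be sketched as follows. The function name and the example rates are hypothetical; the actual per-color exchange rates were announced only at the end of the experiment.

```python
# Hypothetical sketch of the point-to-Euro conversion: each color has
# its own exchange rate between 1 and 6 cents per point.
def payout_in_euros(points, rates_in_cents):
    """points and rates_in_cents map each color to a value."""
    total_cents = sum(points[color] * rates_in_cents[color] for color in points)
    return total_cents / 100.0  # cents -> Euros

# Example (assumed rates): 100 green at 2 ct, 200 yellow at 1 ct,
# 100 blue at 6 ct -> 1000 cents -> 10.0 Euros
example = payout_in_euros(
    {"green": 100, "yellow": 200, "blue": 100},
    {"green": 2, "yellow": 1, "blue": 6},
)
# example == 10.0
```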
Type D participants do not take part in this stage.
In the following, each task will first be explained and then you can start working on it to earn points and money. After each task you will see the description of the next task. As soon as all participants have completed all tasks, you will see the instructions for stage 2.

*End of the screen*

Task 1
In this task you do not have to do anything. The program will toss a virtual coin. If it lands Heads, you will receive 100 green points; if Tails, you receive 0 green points.
Button: Toss the coin.
*End of screen, followed by the screen with the outcome of the coin toss.*

Task 2
In this task, you will see rows of "0" and "1" on the screen, as in the example below.
Please count how many zeros are displayed, enter the number in the field below, and confirm the submission by pressing the button. You have 15 seconds for each screen.
To familiarize yourself with the task and the time constraints, you will see an example screen before proceeding with the task. You will not receive any points for solving the example screen. After solving the example screen you will receive feedback and further instructions.
Button: Begin with the example

*End of screen, followed by the example task*

Task 2: Further Instructions
You have solved the example task (in)correctly.
In this task you can solve two such screens. After you submit your first answer, a second screen will appear and you can again count the number of zeros shown on the screen and enter the result.
For each correct input you receive 100 yellow points. For incorrect entries you receive 0 yellow points.
Previous research has shown that attention and diligence are the most important factors in this task.
Button: Start the task

*End of screen, followed by the two sequential screens with the task and a feedback screen that displayed how many screens they solved correctly and how many points they received*

Task 3
You will see a picture with 8 elements, as in the example below.
The elements in rows and columns follow a pattern. Your task is to find the element missing in the lower right corner. One of the elements 1-8 at the bottom of the picture is the correct element. Please enter the number of the correct missing element. In the example above the correct answer would be 8.
You have 30 seconds to answer.
Previous research has shown that success in this task depends on talent.
For a correct answer you get 100 blue points. For an incorrect answer you get 0 blue points.
You will now be paired with another participant of type P. The groups are formed randomly. You will see how many points the paired participant earned in stage 1 and which picture they chose at the beginning of the experiment.
The points you and the other participant earned will now be redistributed.
The new distribution of points can be determined either by a type D participant or by a decision algorithm. The points can remain with the respective participant or be redistributed. The decision will be made in points, since the value of the points will be revealed only at the end of the experiment.
Both group members can individually decide whether the algorithm or a (human) participant of type D should determine the distribution.
Type D receives a flat payment, regardless of the distribution decision. We ask type D to choose a fair distribution. This participant also sees the two sets of points that the members of the pair earned in stage 1 and which pictures both participants had chosen. You will also know which picture the type D participant had chosen. Type D will decide how the total number of points should be distributed between both type P participants.
The algorithm determines a fair distribution based on data from a survey with several hundred participants. The participants of the survey were informed about the three tasks completed in stage 1 and then determined what a fair distribution is. The algorithm will apply these decision patterns to the group's income and determine a fair distribution.
First, we ask both group members to choose type D or the algorithm. If both members make the same choice, the preferred decision maker (human/algorithm) is assigned. If the two group members choose different decision makers, it is randomly determined whether type D or the algorithm determines the distribution.
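The assignment rule just described can be sketched as follows. This is an editorial illustration, not the experiment's actual software; the function name is hypothetical.

```python
import random

def assign_decision_maker(vote_a, vote_b, rng=None):
    """Each vote is 'human' or 'algorithm'. A unanimous pair gets its
    preferred decision maker; otherwise a fair coin decides."""
    if vote_a == vote_b:
        return vote_a  # unanimous choice is implemented
    rng = rng or random.Random()
    return rng.choice(["human", "algorithm"])  # tie broken at random

# Unanimous choices are always implemented:
assert assign_decision_maker("algorithm", "algorithm") == "algorithm"
assert assign_decision_maker("human", "human") == "human"
```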
While the distribution of points is being determined, we ask you to indicate what you think is a fair distribution of points. We ask for your opinion; there is no right or wrong. What you state will not be shown to any of the other participants, type P or D. What you indicate here has no influence on your payout.
Stage 2 is played through a total of six times.
In each round of stage 2 you will form a pair with another participant of type P and each time you will choose whether type D or the algorithm will determine the distribution.A new distribution will be determined each time, always starting from your initial earnings in stage 1.
After all six rounds are completed, the distribution of points in each round will be shown. You will see both the initial distribution of points and the new distribution. We then ask you to indicate for each round to what extent you are satisfied with the distribution.
At the end of the experiment, one of these six rounds will be selected and you will be paid according to the points you have after the distribution in this round. Each round is equally likely to be paid out.
Once you have read and understood all the information on the screen, please press continue.
Button: Continue

*End of the screen. As participants had to wait for other players to finish reading the instructions, the instructions were repeated on the waiting screen.*

Round 1
You are in the group with the player who earned the following points. In this stage, you, as type D, do not have to make any decisions. You will decide in stage 2 how to redistribute the earnings of the other participants from this stage. Below, an explanation of how type P participants earn money in this stage is provided.
Type P participants are given 3 separate tasks in which they can earn money. During the experiment, "points" are used instead of Euros. In each task, one earns points of one color (green, yellow, blue). How much each point is worth will be announced at the end of the experiment.
Task 1: In this task the participants do not have to do anything. The program will toss a virtual coin. If it lands Heads, they will receive 100 green points; if Tails, they will receive 0 green points.
Task 2: In this task, participants will see rows of "0" and "1" on the screen, as in the example below.
Participants count how many zeros are displayed and enter the number in the field below. They have 15 seconds for each screen. After they enter the number, a new screen appears and they can again enter the number of zeros shown on the screen.
For each correct answer they receive 100 yellow points. For incorrect answers they receive 0 yellow points.
Previous research has shown that attention and diligence are the most important factors to succeed in this task.
To familiarize themselves with the task and the time constraints, participants solve a sample screen before proceeding with the task.

Task 3
Participants will see a picture with 8 elements, as in the example below.
The elements in rows and columns follow a pattern. The task of the participants is to find the element missing in the lower right corner. One of the elements 1-8 at the bottom of the picture is the correct element. Participants of type P have 30 seconds to submit an answer. In the example, the correct answer would be 8.
For a correct answer they get 100 blue points.For an incorrect answer they get 0 blue points.
Previous research has shown that success in this task depends on talent.

You will know the income of the participants from stage 1 (as in the table above) and which picture the participant had chosen at the beginning of the experiment.
For each color (green, yellow, blue) you should separately specify how many points each member should get. One point of each color is exchanged for Euros at a different exchange rate. The exchange rate of each point is between 1 and 6 cents and will be revealed only at the end of the experiment.
You can redistribute the points as you like; there is no right or wrong. Please choose a distribution that you think is fair.

Figure 1 :
Figure 1: Sequence of events in all treatments for regular participants and human decision makers.

Figure 2 :
Figure 2: By Wassily Kandinsky

Type P and D have different tasks, which are explained in detail below.

*End of the screen.*

You are a type X participant. You will receive further explanation at the beginning of each stage.

*End of the screen.*

B.2 Displayed to Type P only

Stage 1: Type P
In this stage you will get 3 separate tasks. In each of them you can earn money.
Button: Start the task

*End of the screen. Followed by the task and feedback screen. A separate screen with the results of stage 1, displaying the chosen picture and the earned points of each colour, was shown.*

Stage 2: Type P

*End of the screen.*

B.3 Displayed to Type D only

After the participants of type P have earned points in stage 1, they are randomly matched into pairs. As player D, you are shown the group members' earnings in points and are now asked to determine a fair distribution of the points. You will see the income from each task in the form of a table as shown below.

Table 1 :
Features of treatments and sample information

Table 2 :
Chosen decision maker
Notes: Frequencies of choices in treatments Tr2 (AI or human decision maker with the group info) and Tr1 (AI or human decision maker without the group info). Percentages in parentheses below absolute numbers of observations. Column χ² Tr2 displays the p-value for the test of differences between the discrimination classes within Tr2. Column χ² Tr1 contains the p-values from the individual tests of the observations in the respective row against the observations in Tr1. The final column contains the p-values from binomial tests of the observations against a hypothetical 50% frequency of AI choices (or human choices, respectively).

Table 4 :
Determinants of satisfaction and fairness ratings.

Table 5 :
Determinants of satisfaction and fairness ratings.

Table 7 :
Continuation: Determinants of satisfaction and fairness ratings.