On the Assessed Strength of Agents’ Bias

Recent work in social epistemology has shown that, in certain situations, less communication leads to better outcomes for epistemic groups. In this paper, we show that, ceteris paribus, a Bayesian agent may believe less strongly that a single agent is biased than that an entire group of independent agents is biased. We explain this initially surprising result and show that it is in fact a consequence one may conceive on the basis of commonsense reasoning.


Introduction
Rational agents sometimes believe a conjunction more strongly than they believe every single literal in this conjunction. We show that this peculiar fact applies to Bayesian agents in particular circumstances, and explain why.
In order to do so, we tackle the problem of how to assess a group of agents (e.g., scientists) providing testimony vis-à-vis a single agent (e.g., one scientist) providing testimony. Unlike previous works (e.g., Zollman 2013; Angere and Olsson 2017; Holman and Bruner 2015), which compared different communication structures of the same group of agents (N vs. N comparison), we here study how a group of agents compares to a single agent (N vs. 1 comparison).
Testimony consists of reports the agents provide based on their findings. The fallible agents considered here are either good inquirers (call them reliable) or not-so-good inquirers (call them biased). Intuitively, we are less likely to believe that a group of N independent agents each reporting a finding are all biased than we are to believe that one single agent providing these same N reports is biased, ceteris paribus. In other words: upon receiving the news, we assign a greater probability to at least one of the N independent agents being unbiased than we ascribe to the single agent being unbiased. We here show that this intuitive probability judgement does not universally hold true (Theorems 1 and 2) and explain why this is the case.
But why is it that we judge it more likely that, ceteris paribus, one single agent is biased than that a group of independent agents are all biased? Prior to obtaining evidence, the probability of a single agent being reliable is equal to some value, say $\rho$. The prior probability of the agent being unreliable (biased) is then $1 - \rho =: \bar\rho$. The ceteris paribus clause then entails that each of the N agents is biased with probability $\bar\rho$. The independence judgement then requires that the probability of all N agents being biased is $\bar\rho^N$. Clearly, $\bar\rho > \bar\rho^N$ for $N \ge 2$. The difference between $\bar\rho$ and $\bar\rho^N$ increases with growing N. As evidence accumulates, we seem to have every reason to believe that the posterior probabilities will continue to satisfy this inequality.
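The prior comparison above can be checked with a few lines of Python. This is only a minimal sketch: the function name and the value 0.1 for the prior probability of reliability are illustrative assumptions, not taken from the paper.

```python
def prior_all_biased(rho: float, n: int) -> float:
    """Prior probability that n independent agents are all biased,
    where each agent is reliable with prior probability rho."""
    return (1.0 - rho) ** n


rho = 0.1  # illustrative prior probability of reliability (assumption)
for n in (1, 2, 5, 10):
    # n = 1 is the single agent; larger n are groups of independent agents.
    print(n, round(prior_all_biased(rho, n), 4))
```

Before any evidence arrives, the probability that the whole group is biased shrinks geometrically with the group size, which is what drives the intuitive judgement.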
The probability functions considered here are those of a Bayesian agent receiving testimony from other agents (scientists). Since Bayesian agents are not prone to conjunction fallacies (holding that the probability of a conjunction is greater than the probability of a subset of conjuncts, see Tversky and Kahneman 1983), one may think that the lesson drawn from studying conjunction fallacies applies here. However, we shall see that this lesson does not apply here and the intuitive answer is incorrect (Sect. 3.2).
The rest of this paper is organised as follows: next, we provide background and motivation for the area of research this paper contributes to (Sect. 2.1). Based on this exposition we introduce the formal model for our investigation (Sect. 2.2). Within the model we can formalise the Bayesian probability judgement we want to investigate (Sect. 3.1). We go on to derive (Sect. 3.2) and explain (Sects. 3.3 and 3.4) our main results and offer some conclusions regarding our immediate result and some wider implications (Sect. 4).

Background and Motivation
We consider a group of agents providing testimony for or against a hypothesis. We shall here not assume that we can fully rely on the reports provided by the agents, but instead we shall assess agents' reliability.
The Scandinavian School of Evidentiary Value conceived of unreliable agents as providing evidence which teaches us nothing about the hypothesis of interest; see further Bovens and Hartmann (2003, 57), Edman (1973), Hansson (1983) and Schum (1988). In Bovens and Hartmann (2003), this notion of unreliability has been formalised in a Bayesian network model for determining the confirmation a body of evidence provided by a group of agents bestows on the hypothesis of interest. Their model has found applications in the philosophy of science concerning the epistemological Variety of Evidence Thesis (Bovens and Hartmann 2002; Claveau 2013; Claveau and Grenier 2019; Stegenga and Menon 2017; Landes 2020b, a), which states that varied evidence for a hypothesis confirms it more strongly than less varied evidence, ceteris paribus. Furthermore, it has been employed in Hahn et al. (2016) for modelling social debates of findings in climate science, in the philosophy of economics (Casini and Landes 2020) and in (the philosophy of) medicine (Abdin et al. 2019; Landes et al. 2018; De Pretis et al. 2019). Crucial to this body of work is the irrelevance of unreliable sources (Claveau 2013 calls this the IUS condition). But do we conceive of unreliable sources as providing no relevant information towards hypothesis confirmation? Collins et al. (2015) found that human subjects tend to favour the construal of unreliable sources put forward in Olsson (2011) over the approach of Bovens and Hartmann (2003); see also Merdes et al. (2021). In Olsson's approach, unreliable sources are construed as sources which always lie, i.e., the testimony of an unreliable agent is the exact opposite of what she thinks.
We are here interested in epistemic contexts in which fallible agents may be unreliable due to (possibly sub-conscious) biases. We use sponsorship bias as our motivation, which makes agents' reports more likely to be in line with their sponsor's interest. A maximally strong bias is exhibited by agents who will always report findings in line with their sponsor's interest. Such agents are completely irrelevant for hypothesis confirmation since they provide no relevant information.
Agents who are biased to a non-maximal degree report findings with different probabilities than fully reliable agents, i.e., unbiased agents. We are here interested in biased agents that have a greater probability of reporting findings which support the hypothesis than unbiased agents, where this probability is strictly less than one. That is, at times such agents do report findings which are not in their sponsor's interest. Reports from such agents do provide some information concerning the hypothesis. Reports supporting the hypothesis are (much) less confirmatory than reports from unbiased agents, whereas reports from biased agents conflicting with the hypothesis and thus with the sponsor's interest carry extra dis-confirmatory oomph.

The Formal Model
We adopt the Bovens and Hartmann model, changing only their formalisation of unreliable agents. To the best of our knowledge, neither the Bovens and Hartmann model nor any of its derivatives have previously been employed to compare the posterior probabilities of unreliable sources. To keep this manuscript self-contained we now briefly describe the Bovens and Hartmann model (Sects. 2.2.1 and 2.2.2) and our adaptation (Sect. 2.2.3).
For a report variable REP, Rep means that the pertinent consequence is reported to hold while $\overline{\mathrm{Rep}}$ means that the consequence is reported to fail to hold. Finally, every report is modulated by a single reliability variable REL, where Rel means that the reporting agent is assessed to be reliable and $\overline{\mathrm{Rel}}$ = Bias stands for a biased agent. Report variables representing different reports originating from the same agent thus share their modulating reliability variable. Every agent is thus represented by a single variable formalising the agent's possible types: reliable or biased.
A Bayesian prior probability function, P, defined over the algebra generated by these variables, is selected. The choice of this probability function P is constrained by conditional independencies capturing the relation of variables, which are graphically represented in a Bayesian network.

Topology of Bayesian Networks
The topology of Bovens and Hartmann networks is generated by the following modelling choices regarding probabilistic independences and dependences.
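These conditional independencies, denoted by ⊥, include $\mathrm{REP}_i \perp X \mid \mathrm{REL}_{n_i}, \mathrm{CON}_{m_i}$ for every report variable $\mathrm{REP}_i$ and every other variable $X$, where $\mathrm{REL}_{n_i}$ is the reliability variable pertaining to $\mathrm{REP}_i$ and $\mathrm{CON}_{m_i}$ the pertinent consequence variable.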
The probability of whether a testable consequence is true or false is directly influenced by whether the hypothesis of interest is true or false. Similarly, the probabilities of reports that a testable consequence is reported depends on whether the relevant testable consequence of the hypothesis is true and on the reliability of the reporting agent. This motivates the edges and their orientations in such Bayesian networks; example topologies can be found in Figure 2.

Prior and Conditional Probabilities
The initial assessment of the hypothesis is expressed as the probability 0 < P(Hyp) < 1 . By initial we mean prior to receiving testimony. The initial assessment of an agent's reliability is captured by 0 < P(Rel) =∶ = 1 − P(Bias) < 1.
Consequences of the hypothesis are construed as being probabilistically entailed by the hypothesis, that is, Con is more likely under Hyp than under its negation, $\overline{\mathrm{Hyp}}$. Mathematically speaking:

$0 < P(\mathrm{Con} \mid \overline{\mathrm{Hyp}}) < P(\mathrm{Con} \mid \mathrm{Hyp}) < 1$ for all consequence variables CON.

The hypothesis is probabilistically independent of the agents' reliabilities: $\mathrm{HYP} \perp \mathrm{REL}_n$ for all n.

So far, we have been following Bovens and Hartmann (2003), from which we shall now deviate. The difference in models is explained by the different construals of unreliable (biased) agents (see Sect. 2.1), which give rise to a different formalisation. We here consider fallible reliable agents, i.e., agents who sometimes fail to report the truth. $0 < \alpha_+ < 1$ is a reliable agent's probability of reporting a false negative (reporting that the consequence is false while it is in fact true) and $0 < \alpha_- < 1$ is a reliable agent's probability of reporting a false positive (reporting that the consequence is true while it is in fact false):

$P(\overline{\mathrm{Rep}} \mid \mathrm{Con}, \mathrm{Rel}) = \alpha_+$ and $P(\mathrm{Rep} \mid \overline{\mathrm{Con}}, \mathrm{Rel}) = \alpha_-$.

Intuitively, the more often an agent's testimony matches the true state of the world (the truth value of CON), the greater an agent's competence. So, the smaller $\alpha_+$ and $\alpha_-$, the better the evidence an agent's testimony provides. To streamline the exposition we suppress indices indicating the particular agent.
Agents biased in the above discussed sense are more likely to report findings supporting the hypothesis than reliable agents. That is, the probability that an agent assessed to be biased provides a report that a consequence has been observed is greater than the probability that an agent assessed to be reliable provides such a report.
In case the pertinent consequence is true, this means that

$P(\mathrm{Rep} \mid \mathrm{Con}, \mathrm{Bias}) =: \gamma > 1 - \alpha_+ = P(\mathrm{Rep} \mid \mathrm{Con}, \mathrm{Rel})$.

In case the pertinent consequence is false, this means that

$P(\mathrm{Rep} \mid \overline{\mathrm{Con}}, \mathrm{Bias}) =: \delta > \alpha_- = P(\mathrm{Rep} \mid \overline{\mathrm{Con}}, \mathrm{Rel})$.

We are here interested only in fallible agents and thus agents assessed to be biased commit errors of both types. Hence, neither $\gamma$ nor $\delta$ can be equal to one. A possible configuration of parameters is shown in Figure 1; an overview is given in Table 1. By contrast, $0 < \delta < \alpha_- < \gamma < 1 - \alpha_+ < 1$ represents a biased agent more likely to report findings dis-confirming the consequence, and $\gamma = P(\mathrm{Rep} \mid \mathrm{Con}, \overline{\mathrm{Rel}}) = P(\mathrm{Rep} \mid \overline{\mathrm{Con}}, \overline{\mathrm{Rel}}) = \delta$ defines unreliable agents in the Bovens and Hartmann (2003) sense.
There are two types of agents in our model, reliable ones characterised by $\alpha_+$, $\alpha_-$ and biased agents represented by $\gamma$, $\delta$, and one is unsure about each agent's type (P(Rel)). It poses no conceptual difficulty to model a situation in which agents may have multiple types of bias and one is unsure about the type of bias a particular agent possesses. Technically, this is achieved by using variables REL of greater arity, adopting a prior over these greater arity variables and formalising different types and/or strengths of bias (Olsson 2005, Sect. 4.3).
Reports (even those from the same agent) are here taken to be independent from each other given the true state of the world and given the type(s) of the agent(s) the reports are obtained from. More precisely, the probability of a report stating that a consequence of the hypothesis holds (or fails) only depends on the reporting agent and the truth value of the consequence. This models a situation in which different reports are, for example, generated by independent random tosses of the same coin or by identically sampling from the same population. A report variable hence has only two parents (a reliability variable and a consequence variable) and no children.
All these assumptions are substantial assumptions and none of them will always hold in every situation. We do not want to make the case that our assumptions are appropriate in a wide range of situations. All we rely on is that there are some situations in which our assumptions are reasonable.
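To make the role of the four report probabilities concrete, the following sketch compares how strongly a single report bears on the truth of the consequence, depending on the reporting agent's type. All numerical values, the dictionary layout and the function name are illustrative assumptions; they are merely chosen so that the biased agent is more likely than the reliable agent to issue a positive report.

```python
# Probability of a positive report, P(Rep | consequence, agent type).
# All values are illustrative assumptions, not taken from the paper.
P_REP = {
    ("con", "rel"): 0.90,       # reliable agent, consequence true
    ("not_con", "rel"): 0.10,   # reliable agent, false positive
    ("con", "bias"): 0.99,      # biased agent, consequence true
    ("not_con", "bias"): 0.80,  # biased agent, false positive
}


def likelihood_ratio(agent_type: str, positive: bool) -> float:
    """P(report | Con, type) / P(report | not Con, type) for one report."""
    p_con = P_REP[("con", agent_type)]
    p_not = P_REP[("not_con", agent_type)]
    if positive:
        return p_con / p_not
    return (1.0 - p_con) / (1.0 - p_not)


for agent_type in ("rel", "bias"):
    for positive in (True, False):
        print(agent_type, positive, round(likelihood_ratio(agent_type, positive), 4))
```

For these particular values, a positive report from a biased agent has a likelihood ratio much closer to one than a positive report from a reliable agent, while a negative report from a biased agent has a smaller ratio than one from a reliable agent. Other admissible parameter choices need not show the second pattern.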

Formalising the Probability Judgement
We can now return to asking the question raised in the introduction: "Ceteris paribus, do we always believe more strongly that a single agent is biased than we believe that an entire group of independent agents is biased?" As we argued in Section 1, the intuitive answer is affirmative. Before we can proceed to thoroughly answer this question we need to do two things.
First, we need to specify the evidence reports, the network structure of the reports and how the reports pertain to (the testable consequences of) the hypothesis of interest. In short, we have to specify the topology of Bayesian networks for our application. Bovens and Hartmann consider three scenarios; each scenario consists of two distinct set-ups (i.e., network topologies). We here only discuss Scenario 1 and Scenario 3. In the first set-up, one single agent provides all reports; in the second set-up, N agents each provide one report. See Figure 2. In the situations depicted on the left, a single agent provides all the reports. Since in this situation the reports are obtained from a single agent, we use one single variable to model the (un-)reliability of this source. In the situations depicted on the right, every report is obtained from a different agent. Consequently, we use a different reliability variable for every agent to capture the (un-)reliability of all the different agents.

Table 1 Overview of employed variables, their intended interpretation and (conditional) probabilities. To increase readability, we use ¬ to denote negation in this table. Columns: Variable; Intended interpretation; (Conditional) probabilities.

Figure 2 The three scenarios described in Bovens and Hartmann (2003). Set-up 2:2 is the same as Set-up 3:1.
Second, we obviously need to make sure that conditional probabilities in both compared set-ups are, ceteris paribus, the same. So, we impose the condition that the probabilities defined in Section 2.2.3 are the same for all agents. Furthermore, we assume that for all n the n-th report in both set-ups shows the same result. Finally, we require that all consequence variables are assigned the same conditional probabilities. Mathematically, this just means that we are now not abusing notation any more when dropping a great number of indices.
The probability function for the first set-up is denoted by $P_1$, the function for the second set-up by $P_2$. The bodies of evidence are respectively denoted by $E_1$ and $E_2$. Finally, we can formalise our probability judgement: "Ceteris paribus, we believe more strongly that a single agent is biased than we believe that an entire group of independent agents is biased" by

$P_1(\mathrm{Bias} \mid E_1) > P_2\big(\bigwedge_{n=1}^{N} \mathrm{Bias}_n \mid E_2\big)$. (1)
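Both posterior probabilities in this judgement can be computed by direct enumeration. The sketch below does so for Scenario 1 (all reports concern one shared consequence) with every report being negative. All parameter values and all names are illustrative assumptions; the code is a numerical illustration under the independence assumptions of Sect. 2.2, not the paper's proof.

```python
from itertools import product

# Illustrative parameter values (assumptions, not taken from the paper).
P_HYP = 0.5
P_CON_GIVEN_HYP = {True: 0.8, False: 0.3}  # P(Con | Hyp), P(Con | not Hyp)
P_TYPE = {"rel": 0.1, "bias": 0.9}         # prior over an agent's type
# P(negative report | truth value of Con, agent type)
P_NEG = {
    (True, "rel"): 0.10,   # reliable agent: false negative
    (False, "rel"): 0.90,  # reliable agent: true negative
    (True, "bias"): 0.01,  # biased agent rarely reports against the sponsor
    (False, "bias"): 0.20,
}


def likelihood(types):
    """P(all reports negative | agent types), Scenario 1: one shared CON."""
    p_con = P_HYP * P_CON_GIVEN_HYP[True] + (1 - P_HYP) * P_CON_GIVEN_HYP[False]
    total = 0.0
    for con, p in ((True, p_con), (False, 1 - p_con)):
        term = p
        for t in types:  # one entry per report
            term *= P_NEG[(con, t)]
        total += term
    return total


def posterior_single_biased(n):
    """Set-up 1: one agent of unknown type issues all n reports."""
    joint = {t: P_TYPE[t] * likelihood([t] * n) for t in P_TYPE}
    return joint["bias"] / sum(joint.values())


def posterior_group_all_biased(n):
    """Set-up 2: n independent agents issue one report each."""
    evidence = 0.0
    for types in product(P_TYPE, repeat=n):  # all type assignments
        prior = 1.0
        for t in types:
            prior *= P_TYPE[t]
        evidence += prior * likelihood(types)
    all_biased = P_TYPE["bias"] ** n * likelihood(["bias"] * n)
    return all_biased / evidence


print(posterior_single_biased(2), posterior_group_all_biased(2))
```

For these values and two negative reports, the posterior probability that the single agent is biased comes out below the posterior probability that both group members are biased, so the intuitive inequality fails in this instance.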

Results
We now state our main result:

Theorem 1 In Scenario 1 and Scenario 3, for all 0 < P(Hyp), P(Con|Hyp) < 1, if the three conditions (2) all hold, then it holds that $P_1(\mathrm{Bias} \mid E_1) < P_2\big(\bigwedge_{n=1}^{N} \mathrm{Bias}_n \mid E_2\big)$.

Proof All proofs can be found in the Appendix.

The answer to our question is thus no. For all probability assignments satisfying (2), we believe more strongly that the entire group of agents is biased than we believe that the single agent is biased, if all reports state that the pertinent consequence of the hypothesis fails to hold.
Since there is a canonical morphism induced by switching the truth-values of binary propositional variables, one may wonder whether there is a similar such phenomenon for reliability instead of bias. Indeed, there is

Theorem 2 In Scenario 1 and Scenario 3, if then for all 0 < P(Hyp), P(Con|Hyp) < 1 it holds that
Having derived these results in our model, we next interpret them in the setting we described. The obtained results also apply to other settings our model adequately represents. We discuss different types of biases which our model may adequately represent in Section 4.

A More Intuitive Picture
Having obtained the formal results we now know for which cases the probability of a conjunction behaves in an unexpected way. Based on this knowledge we paint a more intuitive picture of our results.
Consider a situation in which a person you believe to be unreliable tells you something you did not expect to hear. For example, the chief scientist of a pharmaceutical company publicly states that a drug they currently sell and recently researched is less effective than previously believed. Based on this information you believe more strongly that the agent is in fact reliable. Next suppose that there are a number (N, say) of chief scientists and each scientist tells you, about the drug they have been exclusively selling and recently researching, that it is less effective than previously believed. What do you now think about the group of scientists? Your belief in their individual reliabilities has increased. This means that your belief in their individual unreliabilities has decreased. Supposing that there is no connection between the different companies, scientists and drugs, your belief in all of them being unreliable decreases proportionally to the number of reports.
Now suppose instead that there is a single chief scientist working for a pharmaceutical company who tells you that a number of drugs (N) sold by her company, which have all recently been researched, are less effective than previously thought. Let us make the picture more concrete by assuming that there is no connection between the different drugs (different research labs studying them, targeted at different diseases). To ease the comparison between this and the above set-up, we assume that the content of report i is the same in this and the above set-up. Furthermore, we suppose that all reports are equally (un-)likely. What do you now believe about the reliability of the single scientist? Clearly, your belief in her reliability increases. The increase in belief in her reliability is the stronger the more you initially believed the agent to be unreliable.
Furthermore, the less likely you initially believed it to be that you would hear such testimony from a biased agent, the stronger the reversal of the standing of the scientist in your eyes. Note that the reports from a single person have a cumulative effect on the assessed reliability. The situation resembles the accumulation of compound interest: the increase in the assessed reliability (the interest) sky-rockets.
But since an increase in the assessed reliability means a decrease in the assessed unreliability, the latter plunges very quickly indeed. It is then conceivable that, ceteris paribus, in certain cases you believe less strongly that the single agent is biased than you believe that all scientists in the group are biased.
We next discuss the parameter values for which this unexpected behaviour of the probability of a conjunction obtains.

Explanation of Results
Since these two results are natural duals of each other, we shall only discuss Theorem 1. Why is it that one believes more strongly that the entire group is biased than that the single agent is biased? We can explain this by looking at the parameter values for which this happens. We develop a deeper understanding of the first two conditions in (2) by re-writing them as

$\frac{\alpha_+}{1 - \gamma} \ge 4(2^{N-1} - 1)$ and $\frac{1 - \alpha_-}{1 - \delta} \ge 4(2^{N-1} - 1)$. (4)

This means that biased agents are strongly biased, $\gamma \gg 1 - \alpha_+$ and $\delta \gg \alpha_-$.
Holding the truth value of the CON variable fixed, we see that the quotients on the left and on the right describe ratios of the likelihoods of the reported findings. The literature on Bayesian statistics refers to these ratios as Bayes factors, which are, in this literature, considered to be the measure of the strength of evidence. Translated to our setting, this means that the received reports are strong evidence against the hypothesis that agents are biased, for large N. For N = 2, the Bayes factors are only required to be greater than or equal to four; a Bayes factor equal to three is conventionally interpreted as relatively weak evidence for a hypothesis.
The third condition bounds the prior probability of reliability, $P(\mathrm{Rel}) = \rho$, from above. So, upon receiving multiple reports dis-confirming the hypothesis from a single agent, the assessed reliability of this agent sky-rockets. In turn, the assessed bias of this agent falls through the floor. The stronger the assessed bias, that is, the closer $\gamma$, $\delta$ are to one (the closer $1 - \gamma$, $1 - \delta$ are to zero), the larger the "Bayes factors" in (4) and the less likely one thinks it is that one agent consistently reports contrary to her bias. Hence, the stronger this effect.
Furthermore, the smaller the prior probability of agents being reliable, i.e., the smaller $\rho$ (the greater $\bar\rho$), the more relevant the above considerations become. Hence, the stronger the effect.
Instead, if these findings are reported by a group of agents where every agent only makes one single report, then the assessed bias of every single agent decreases only somewhat. The assessed bias of the entire group hence also falls-but only moderately so.
For large enough Bayes factors, the drop in the assessed bias of the single agent outpaces the decrease of the assessed bias of the entire group of agents.
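The contrast between the cumulative update for a single agent and the one-off updates for group members can be sketched as follows. As a simplification, the truth value of the consequence is held fixed, so that one likelihood per agent type suffices. The function names and all numbers (a prior reliability of 0.1 and a Bayes factor of 10 in favour of reliability per report) are illustrative assumptions.

```python
def posterior_bias_single(rho, like_rel, like_bias, n):
    """P(Bias | n reports from ONE agent), truth value of CON held fixed.

    like_rel and like_bias are the probabilities of one such report under
    a reliable and a biased agent; the n likelihoods multiply.
    """
    joint_bias = (1.0 - rho) * like_bias ** n
    joint_rel = rho * like_rel ** n
    return joint_bias / (joint_bias + joint_rel)


def posterior_bias_group(rho, like_rel, like_bias, n):
    """P(all n agents biased | one report each), truth value of CON held fixed.

    With CON fixed the agents are independent, so the posterior of the
    conjunction is the product of n identical single-report posteriors.
    """
    return posterior_bias_single(rho, like_rel, like_bias, 1) ** n


rho, like_rel, like_bias = 0.1, 0.10, 0.01  # Bayes factor 10 per report
for n in range(1, 6):
    single = posterior_bias_single(rho, like_rel, like_bias, n)
    group = posterior_bias_group(rho, like_rel, like_bias, n)
    print(n, round(single, 5), round(group, 5))
```

In this sketch the single agent's posterior odds of being biased are multiplied by the inverse Bayes factor once per report, whereas each group member's odds are multiplied only once. For these values the single agent's assessed bias drops below that of the whole group from two reports onwards.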

Further Observations
We also want to point out that the condition of binary report variables is unnecessarily restrictive. All results immediately generalise to report variables of finite arity, as long as the values of the received report variables satisfy the conditions in (4).
Observe that Theorems 1 and 2 apply to all $\gamma, \delta, \alpha_+, \alpha_- \in (0, 1)$ which satisfy (3). In particular, there is no constraint which couples $\gamma$ and $\delta$, nor is there a constraint which couples $\alpha_+$ and $\alpha_-$. Hence, these theorems hold also for all $\gamma, \delta, \alpha_+, \alpha_- \in (0, 1)$ which satisfy (3), if $\gamma = \delta$, $1 - \alpha_+ = \alpha_-$ or ($\gamma = \delta$ and $1 - \alpha_+ = \alpha_-$) hold. In case $\gamma = \delta$ a biased agent is an unreliable agent in the Bovens and Hartmann-sense; in case $1 - \alpha_+ = \alpha_-$ an agent assessed to be reliable in our setting is an unreliable agent in the Bovens and Hartmann-sense.
Furthermore, Theorems 1 and 2 also apply to incompetent agents with $1 - \alpha_+ < \alpha_-$ and/or $\gamma < \delta$. Faced with reports from such incompetent agents, one had better believe the opposite of the reported findings. One hence perceives such agents as liars in the sense of Olsson (2011).
We also observe that Theorems 1 and 2 do not distinguish between Scenario 1 and Scenario 3: that is, the constraints on the probability assessments are the same for Scenario 1 and Scenario 3. This observation should calm all remaining worries that somehow the consequences of the hypothesis of interest do the heavy lifting here; they do not. This contrasts with the results in Bovens and Hartmann (2003) and Osimani and Landes (2020) for hypothesis confirmation, which distinguish between Scenario 1 and Scenario 3.
Finally, we remark that we obtain the counter-intuitive result for all group sizes, N ≥ 2.
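That the effect does not hinge on a shared consequence can be illustrated by repeating the direct computation for Scenario 3, in which every report concerns its own consequence and the consequences are independent given the hypothesis. As before, all parameter values and names are illustrative assumptions, and all reports are taken to be negative.

```python
from itertools import product

# Illustrative parameter values (assumptions, not taken from the paper).
P_HYP = 0.5
P_CON_GIVEN_HYP = {True: 0.8, False: 0.3}  # same for every consequence
P_TYPE = {"rel": 0.1, "bias": 0.9}
# P(negative report | truth value of the report's consequence, agent type)
P_NEG = {
    (True, "rel"): 0.10,
    (False, "rel"): 0.90,
    (True, "bias"): 0.01,
    (False, "bias"): 0.20,
}


def report_likelihood(hyp, agent_type):
    """P(negative report | Hyp value, type), summing out the report's own CON."""
    p_con = P_CON_GIVEN_HYP[hyp]
    return p_con * P_NEG[(True, agent_type)] + (1 - p_con) * P_NEG[(False, agent_type)]


def likelihood(types):
    """P(all reports negative | agent types), Scenario 3: one CON per report."""
    total = 0.0
    for hyp, p in ((True, P_HYP), (False, 1 - P_HYP)):
        term = p
        for t in types:  # reports are independent given Hyp and the types
            term *= report_likelihood(hyp, t)
        total += term
    return total


def posterior_single_biased(n):
    """Set-up 1: one agent of unknown type issues all n reports."""
    joint = {t: P_TYPE[t] * likelihood([t] * n) for t in P_TYPE}
    return joint["bias"] / sum(joint.values())


def posterior_group_all_biased(n):
    """Set-up 2: n independent agents issue one report each."""
    evidence = 0.0
    for types in product(P_TYPE, repeat=n):
        prior = 1.0
        for t in types:
            prior *= P_TYPE[t]
        evidence += prior * likelihood(types)
    return P_TYPE["bias"] ** n * likelihood(["bias"] * n) / evidence


print(posterior_single_biased(2), posterior_group_all_biased(2))
```

With these values the single agent is again assessed as less likely to be biased than the entire group, in line with the observation that the two scenarios behave alike.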

Conclusions
Recent work in social epistemology on the topology of group communications has brought the unexpected finding that sometimes epistemic groups fare better when agents (can) only communicate with few of their peers, see Zollman (2013) for an overview and Angere and Olsson (2017) for a recent case in point. Although these results may depend on the epistemic group being comprised of honest truth-seeking agents, as argued by Holman and Bruner (2015), and on particular parameters (Rosenstock et al. 2017), this part of the message is loud and clear: sometimes less is more in social epistemology. This paper replicates this message for N versus 1 comparisons. We draw two further immediate conclusions: intuitions in multi-agent settings must be appreciated with due care, and formal modelling can help us discover interesting belief dynamics between epistemic notions (such as reliability, bias, group size, strength of evidence) that we are very unlikely to have discovered by any other means.
Let us for the moment switch point of view and take the perspective of the group of agents providing testimony. From here, it appears less than ideal that the entire group is perceived more strongly to be biased than a single agent. Group members may feel that the posterior probability assignment $P_1(\mathrm{Bias} \mid E_1) < P_2\big(\bigwedge_{n=1}^{N} \mathrm{Bias}_n \mid E_2\big)$ constitutes an epistemic injustice (Fricker 2007) caused by overly negative prior assessments ($\gamma$, $\delta$, $\bar\rho$ large). One wonders: given that all the agents have done is to report contrary to the perceived bias, is there nothing the group can do to overcome this unfortunate state of affairs? The short answer is: no. There is nothing to be done. Once the prior probabilities are set, Bayesian updating kicks in and finishes the job.
This means that the only road to salvage the standing of the group of agents is a more favourable assessment prior to reporting. This can be achieved by either a more favourable assessment of the strength of bias (smaller $\gamma$, $\delta$) or by a more favourable assessment of being reliable (greater $\rho$). This then demonstrates the importance of appearances and the value of a good public relations section as well as the importance of the choice of the prior probability function in Bayesian epistemology.
We also want to point out that the employed Bayesian network models are rather versatile, having found applications in judgement aggregation, varied evidence reasoning and social epistemology. Future applications await exploration. Further future work may also address inequality (1) with different notions of (un-)reliability in mind, variables of greater arity, and/or bodies of evidence containing conflicting reports. Another interesting avenue is more complicated topologies of the Bayesian network with fewer independencies (more edges), see Claveau and Grenier (2019) and Landes (2020a).
We also remark that while sponsorship bias provided the motivation for our model of a biased agent (in terms of $1 - \alpha_+ < \gamma$ and $\alpha_- < \delta$), our analysis applies to all other biases (or other cognitive states) which make false positives more likely and false negatives less likely. Furthermore, in case false negatives are more likely and false positives are less likely ($1 - \alpha_+ > \gamma$ and $\alpha_- > \delta$), our analysis continues to apply after employing the canonical morphism permuting $\gamma$ and $1 - \alpha_+$ as well as $\delta$ and $\alpha_-$. Since the list of biases is rather large (Bero and Grundy 2016; Hahn and Harris 2014), the analysis presented here may prove relevant for a variety of strands of research.
Finally, our analysis was motivated by considering agents who were either biased or reliable; agents hence had one of two possible types. The formal analysis presented here is, of course, blind to the motivation of the model. Our analysis is hence relevant to all other scenarios in which there is uncertainty about agents' types. Other instances of dichotomous types are right-wing versus left-wing, hawks versus doves (foreign policy), predator versus scavenger, authoritarian versus anarchist and theist versus atheist.
Acknowledgements Open Access funding provided by Projekt DEAL. Barbara Osimani is the PI of the European Research Council-funded project PhilPharm and gratefully acknowledges being fully funded by the project. Jürgen Landes gratefully acknowledges funding from the European Research Council ('PhilPharm' grant 639276) and the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation), 405961989 and 432308570. We also want to thank Lorenzo Casini and Stephan Hartmann for many helpful discussions and comments. Many thanks also to anonymous reviewers and the editor of this journal who helped us to improve the paper.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Appendix
We first prove a technical lemma. We need to introduce a little more notation. For variables, e.g., REP, we use $\mathrm{Rep}^1$ to denote Rep and $\mathrm{Rep}^0$ to denote $\overline{\mathrm{Rep}}$. For all n we use $r_n \in \{0, 1\}$ to denote the value of the n-th evidence report. Recall that $r_n$ does not depend on the set-up, due to our above conventions. Note that we do not require that the $r_n$ are equal.

Lemma 1 In the first scenario we have
In the third scenario we have
Proof For Scenario 1 we find The sign of this expression is equal to the sign of Since the first term is equal to $\bar\rho^{N-1}$, the sign of this expression is equal to . This is equal to the sign of ◻

Theorem 1 In Scenario 1 and Scenario 3 for all 0 < P(Hyp), P(Con|Hyp) < 1, if $P(\overline{\mathrm{Rep}} \mid \mathrm{Con}, \mathrm{Rel}) = \alpha_+ \ge 4(2^{N-1} - 1)(1 - \gamma) = 4(2^{N-1} - 1)\,P(\overline{\mathrm{Rep}} \mid \mathrm{Con}, \mathrm{Bias})$

then it holds that
Proof First, observe that (1 + what is in turn equivalent to 2 N−1 ≤̄N −1 . We shall use to obtain the first strict inequality below (this implies <̄).
To complete the proof for Scenario 1 it suffices to note that The proof for Scenario 3 is analogous: rather than summing over the truth values of CON one sums over all possible combinations of truth values of CON 1 , … , CON N .

Theorem 2 In Scenario 1 and Scenario 3, if
then for all 0 < P(Hyp), P(Con|Hyp) < 1 it holds that
Proof The proof is obtained from the above by a suitable dualisation: switch Rel and Bias (this includes $\rho$ and $\bar\rho$) and consider reports which confirm the consequences rather than dis-confirm them.
For Scenario 1 we find The sign of this expression is equal to the sign of Since we are only interested in the sign of this equation we consider