Groupthink

Abstract

How should a group with different opinions (but the same values) make decisions? In a Bayesian setting, the natural question is how to aggregate credences: how to use a single credence function to naturally represent a collection of different credence functions. An extension of the standard Dutch-book arguments that apply to individual decision-makers recommends that group credences should be updated by conditionalization. This imposes a constraint on what aggregation rules can be like. Taking conditionalization as a basic constraint, we gather lessons from the established work on credence aggregation, and extend this work with two new impossibility results. We then explore contrasting features of two kinds of rules that satisfy the constraints we articulate: one kind uses fixed prior credences, and the other uses geometric averaging, as opposed to arithmetic averaging. We also prove a new characterisation result for geometric averaging. Finally we consider applications to neighboring philosophical issues, including the epistemology of disagreement.

Notes

  1.

    There is a rich mathematical literature on credence aggregation. Genest and Zidek (1986) provide a useful survey of the classic work on this topic. Fitelson and Jehle (2009) present more recent philosophical discussion of some of these results, in the context of the epistemology of disagreement.

    This line of inquiry is inspired by parallel results in social choice theory—beginning from Arrow’s theorem (1970), which gives an impossibility result for combining preference orderings. This family of results typically involves constraints similar to those we’ll discuss, such as Irrelevant Alternatives, Non-Dictatorship, Anonymity, Neutrality, and Unanimity. Arrow’s work has also inspired influential work on aggregating “on-off” judgments (for instance, List and Pettit 2002). There is also important work on the more general case of simultaneously aggregating credences and preferences (such as Mongin 1995; Gilboa et al. 2004)—which is not the case we are considering.

    The key difference between our work and these earlier results is the prominence we give to Conditionalization, which has no natural analogue in aggregating either preferences or full beliefs, and which (perhaps surprisingly) also has not received much attention in the credence aggregation literature. Two conditions that have received significant attention instead are (Conditional) Independence Preservation and the External Bayesian Condition, which we discuss below. But as the rules we will consider make clear, neither of these conditions is implied by Conditionalization, and so our results go beyond those which appeal to either of them.

  2.

    This principle goes by a variety of names in the literature, including “the strong setwise function property”, “strong label neutrality”, and “the context-free assumption”.

  3.

    This was shown independently by McConway (1981) and Wagner (1982). See Genest and Zidek (1986, p. 117).

  4.

    Suppose \(\hbox{ag}\,C\) is the weighted average \(\sum\nolimits_{i} a_{i} \cdot C_{i}\) (with weights \(a_{1}, \ldots, a_{n}\)). Conditionalization tells us

    $$\begin{aligned} \sum \limits_{i} a_{i} \cdot C_{i}(B \mid A) \cdot C_{i}(A)&= \sum \limits_{i} a_{i} \cdot C_{i}(A \wedge B) = \hbox {ag}\,C(A \wedge B) = \hbox {ag}\,C(B \mid A) \cdot \hbox {ag}\,C(A) \\&= \left( \sum \limits_{i} a_{i} \cdot C_{i}(B \mid A)\right) \cdot \left( \sum \limits_{i} a_{i} \cdot C_{i}(A)\right) \end{aligned}$$

    In other words, the weighted average of products is the product of weighted averages. This holds in general only when every weight but one is zero. We can see this by supposing \(a_{j} \ne 0\) and looking at how the group opinion changes when we adjust the jth conditional credence in B given A, holding everything else fixed. Consider any \(C, C' \in \mathbf{C}^n\) where \(C_{i}(A) = C'_i(A) \ne 0\) for all \(i\), \(C_{i}(B \mid A) = C'_{i}(B \mid A)\) for all \(i \ne j\), and \(C_{j}(B \mid A) \ne C'_{j}(B \mid A)\). Applying the equation above to both C and \(C'\) and taking the difference yields

    $$a_{j} \cdot \left( C_{j}(B \mid A) - C'_{j}(B \mid A)\right) \cdot C_{j}(A) = a_{j} \cdot \left( C_{j}(B \mid A) - C'_{j}(B \mid A)\right) \cdot \left( \sum \limits_{i} a_{i} \cdot C_{i}(A)\right)$$

    Cancelling non-zero factors, \(C_{j}(A) = \sum \limits_{i} a_{i} \cdot C_{i}(A) = \hbox {ag}\,C(A)\). This can only hold generally if j is a dictator, that is, if \(a_{i} = 0\) for all \(i \ne j\).

  5.

    This is also called the “weak setwise function property”, or (confusingly) “Independence”. Here is another equivalent version (McConway 1981; see Genest and Zidek 1986).

    • Marginalization: For any subalgebra \(\mathcal{A}\) of propositions, if sequences of credence functions C and \(C'\) agree on \(\mathcal{A}\), then \(\hbox{ag}\,C\) and \(\hbox{ag}\,C'\) agree on \(\mathcal{A}\) as well.

    (The marginalization of a credence function is its restriction to a certain subalgebra. So, if you suppose an aggregation rule to be extended to give you a rule that applies to credence functions defined on the subalgebras as well, this principle amounts to saying that aggregation commutes with marginalization.) The thought is that carving up the possibilities more finely, distinguishing more specific subcases, doesn’t make any difference to the group credences in the coarse-grained possibilities.

  6.

    Independence Preservation is perhaps implausibly strong to begin with. There are cases where two events happen to be independent according to each person’s credences, but intuitively it doesn’t seem important that the group preserve this. Wagner gives this example: if you think a six-sided die is fair, then you should also think that whether an even number is rolled is independent of whether a multiple of three is rolled. Suppose someone else thinks the die is weighted, but in a way that those propositions still happen to come out independent. It’s hard to attach any great importance to keeping this feature of their credences when we combine them. Genest and Wagner (1987) and Wagner (2010b) give further arguments along these lines.

  7.

    This is because \(C(A) = 0\) iff \(C \mid \lnot A\) is the same as C. (For the right-to-left implication, note that \(C(\lnot A) = C(\lnot A \mid \lnot A) = 1\).) So if \(C_{i}(A) = 0\) for each i, then \(\hbox{ag}\,C = \hbox{ag}\,\langle C_{1} \mid \lnot A, \ldots , C_{n} \mid \lnot A\rangle = \hbox{ag}\,C \mid \lnot A\) by Conditionalization, and so \(\hbox {ag}\,C (A) = 0\).

  8.

    This constraint has the same flavor as Pareto principles—for instance, the one in Arrow’s theorem for preference aggregation (see note 1), which says that if every individual ranks X over Y, then the group ranks X over Y as well. (For instance this is how Mongin 1995 motivates Unanimity.)

    While we’re not sympathetic to Irrelevant Alternatives, it’s worth noting as a point of logical geography that Unanimity follows from Irrelevant Alternatives together with a weaker version:

    • Weak Unanimity: If every individual has the same credences for every proposition, then the group also has those credences.

    Weak Unanimity is intuitively much weaker: it says nothing at all about what to do in cases of disagreement.

    Another point to note is that the following arguments still go through if Unanimity is restricted to apply to cases where the group has “pooled evidence” so each individual assigns the very same propositions credence one, as long as their evidence leaves open at least three worlds.

  9.

    The argument straightforwardly generalizes to an even number of people, by replacing each individual with a unanimous bloc, but it is less obvious how this would go for an odd number.

  10.

    In fact, no Non-Dictatorial rule can satisfy the more general version—since the more general version implies both Conditionalization and Irrelevant Alternatives (as a special case, considering ratios with a tautology).

  11.

    The Radon–Nikodym theorem implies that any absolutely continuous credence function can be represented by a density function this way (see e.g. Halmos 1950, Sects. 30–31). Because of the integral formula’s similarity to the fundamental theorem of calculus, the density function f is often called the “derivative” of the measure C and is denoted \(\frac{\text {d}C}{\text {d}\mu }\).

  12.

    Moss’s (2011) rule would work, too, if it’s applied to the priors. But applied to the posteriors it will violate Conditionalization.

  13.

    This holds given the standard assumption that coherent credences are countably additive. If this assumption is dropped, there are Neutral credence functions—for instance, one that assigns probability one to each cofinite set of worlds, and zero otherwise. (A set is cofinite if it only leaves out finitely many worlds.)

  14.

    Of course, if we restrict attention to cases where the individuals have already pooled their evidence, this version agrees with the common ground version.

  15.

    Note that ordinary conditionalization amounts to the special case where \(\eta\) is zero on some set of worlds and uniform elsewhere. As with Conditionalization, this version of the External Bayesian Condition builds in the assumption that the \(\eta\)-update is well-defined for the group when it is for the individuals.

    Again, to generalize this idea beyond the discrete case, it makes sense to restrict attention to absolutely continuous credence functions; then the \(\eta\)-update is given by pointwise multiplication of \(\eta\) with the probability density function.

  16.

    Wagner (2010a) proves the equivalence with Jeffrey conditionalization. See Field (1978) and Wagner (2002) for discussions of the “same evidence” issue.

  17.

    This rule is attributed to Peter Hammond, who also noted the fact that it obeys the External Bayesian Condition (Genest and Zidek 1986, pp. 119–120). In a blog post (2012) Pruss makes a closely related suggestion for aggregating credences in a single proposition from individuals with the same evidence (namely, averaging the logarithm of odds), and discusses some of its nice features and an alternative motivation.

  18.

    On the other hand, the Geometric Rule does obey these:

    • Pointwise Ratio Unanimity: For any pair of worlds \(w_{1}\) and \(w_{2}\), if each individual in C assigns the same credence ratio between \(w_{1}\) and \(w_{2}\), then \(\hbox {ag}\,C\) also assigns that ratio.

    • Pointwise Comparative Unanimity: If each individual assigns a higher credence in world \(w_{1}\) than \(w_{2}\), then the group does as well.

    As with Unanimity, there are natural analogies between these and the Pareto principle—though there is also a disanalogy, in that these only apply “world by world”. The Geometric Rule does not satisfy the more general versions for arbitrary propositions.

  19.

    Naturally this goes for the Fixed Prior rule, too, but the details of its recommendations will vary depending on what the fixed prior is. Some versions will reject the first bet, and others will reject one of the second bets.

  20.

    This result complements those of Genest (1984) and Genest et al. (1986). Genest (1984) shows that weighted geometric averaging is the only kind of rule that is Externally Bayesian and also obeys a weakened form of Irrelevant Alternatives. (Viz.: the group’s probability density at a world is determined by the individuals’ probability densities at that world, up to a constant normalization factor.) Genest et al. (1986) extend this result to a general characterization of Externally Bayesian operators. The most important difference between our result and these is that we do not rely on the External Bayesian Condition, but only on the weaker Conditionalization principle. Also, our Neutrality and Continuity conditions are orthogonal to Genest’s Irrelevant-Alternatives-style principle. Finally, our result applies to the context of countable probability measures, rather than the more general setting of probability densities.

  21.

    This turns on the fact that if the sums of two infinite sequences converge, then the sum of the sequence of their geometric means also converges. (This is clear, since this sequence is bounded by the pointwise maximum of the two sequences, and the sum of the maxima must converge.)

  22.

    This need not be a probability measure, but it should at least be \(\sigma\)-finite, meaning that the worlds can be partitioned into countably many sets with finite measure.
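
The argument in note 4, that weighted arithmetic averaging satisfies Conditionalization only when one individual is a dictator, can be checked numerically. The following is a minimal Python sketch and not part of the original paper: the two credence functions, the event, and the helper names `conditionalize`, `linear_pool` and `commutes` are hypothetical choices made for illustration. Exact fractions are used so that the equality tests are exact.

```python
from fractions import Fraction as Fr

def conditionalize(credence, event):
    """Restrict a credence function (dict: world -> probability) to `event` and renormalize."""
    total = sum(credence[w] for w in event)
    if total == 0:
        raise ValueError("cannot conditionalize on a zero-probability event")
    return {w: (credence[w] / total if w in event else Fr(0)) for w in credence}

def linear_pool(credences, weights):
    """Weighted arithmetic average of credence functions, world by world."""
    worlds = credences[0].keys()
    return {w: sum(a * c[w] for a, c in zip(weights, credences)) for w in worlds}

# Hypothetical example: two individuals, three worlds.
c1 = {"w1": Fr(1, 2), "w2": Fr(1, 4), "w3": Fr(1, 4)}
c2 = {"w1": Fr(1, 10), "w2": Fr(1, 10), "w3": Fr(8, 10)}
event = {"w1", "w2"}

def commutes(weights):
    """True iff pooling then updating equals updating then pooling on `event`."""
    pool_then_update = conditionalize(linear_pool([c1, c2], weights), event)
    update_then_pool = linear_pool(
        [conditionalize(c1, event), conditionalize(c2, event)], weights
    )
    return pool_then_update == update_then_pool

equal_weights_commute = commutes([Fr(1, 2), Fr(1, 2)])  # both individuals count
dictator_commutes = commutes([Fr(1), Fr(0)])            # individual 1 is a dictator
```

In this example the two orders of operation disagree under equal weights and agree once all the weight is placed on one individual, which matches what note 4 predicts. A single example only illustrates the general argument; it does not replace it.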

References

  1. Arrow, K. J. (1970). Social choice and individual values (2nd ed.). New Haven, CT: Yale University Press.

  2. Christensen, D. (2007). Epistemology of disagreement: The good news. Philosophical Review, 116(2), 187–217. http://www.jstor.org/stable/20446955.

  3. Elga, A. (2007). Reflection and disagreement. Noûs, 41(3), 478–502. http://www.jstor.org/stable/4494542.

  4. Elga, A. (2010). Subjective probabilities should be sharp. Philosophers’ Imprint, 10(05). http://hdl.handle.net/2027/spo.3521354.0010.005.

  5. Field, H. (1978). A note on Jeffrey conditionalization. Philosophy of Science, 45(3), 361–367. http://www.jstor.org/stable/187023.

  6. Fitelson, B., & Jehle, D. (2009). What is the ‘equal weight view’? Episteme, 6(3), 280–293. doi:10.3366/E1742360009000719.

  7. Genest, C. (1984). A characterization theorem for externally Bayesian groups. The Annals of Statistics 12(3), 1100–1105. http://www.jstor.org/stable/2240984.

  8. Genest, C., & Wagner, C. G. (1987). Further evidence against independence preservation in expert judgement synthesis. Aequationes Mathematicae 32(1), 74–86. doi:10.1007/BF02311302.

  9. Genest, C., & Zidek, J. V. (1986). Combining probability distributions: A critique and an annotated bibliography. Statistical Science 1(1), 114–135. http://www.jstor.org/stable/2245510.

  10. Genest, C., McConway, K. J., & Schervish, M. J. (1986). Characterization of externally Bayesian pooling operators. The Annals of Statistics 14(2), 487–501. http://www.jstor.org/stable/2241231.

  11. Gilboa, I., Samet, D., & Schmeidler, D. (2004). Utilitarian aggregation of beliefs and tastes. Journal of Political Economy 112(4), 932–938. doi:10.1086/421173.

  12. Halmos, P. R. (1950). Measure theory. Princeton: D. Van Nostrand.

  13. Jeffrey, R. C. (1983a). The logic of decision. Chicago: University of Chicago Press.

  14. Jeffrey, R. C. (1983b). Bayesianism with a human face. In Testing scientific theories. Minneapolis, MN: University of Minnesota Press.

  15. Kelly, T. (2010). Peer disagreement and higher order evidence. In Social epistemology: Essential readings. Oxford: Oxford University Press.

  16. Lackey, J. (2008). What should we do when we disagree? In Oxford studies in epistemology. Oxford: Oxford University Press.

  17. Lehrer, K., & Wagner, C. G. (1983). Probability amalgamation and the independence issue: A reply to Laddaga. Synthese 55(3), 339–346. doi:10.1007/BF00485827.

  18. Levi, I. (1980). The enterprise of knowledge: An essay on knowledge, credal probability, and chance. Cambridge, MA: The MIT Press.

  19. List, C., & Pettit, P. (2002). Aggregating sets of judgments: An impossibility result. Economics and Philosophy, 18(1), 89–110. http://eprints.lse.ac.uk/704/.

  20. Loewer, B., & Laddaga, R. (1985). Destroying the consensus. Synthese, 62(1), 79–95. http://www.jstor.org/stable/20116085.

  21. McConway, K. J. (1981). Marginalization and linear opinion pools. Journal of the American Statistical Association 76(374), 410–414. doi:10.2307/2287843.

  22. Mongin, P. (1995). Consistent Bayesian aggregation. Journal of Economic Theory 66(2), 313–351. doi:10.1006/jeth.1995.1044.

  23. Moss, S. (2011). Scoring rules and epistemic compromise. Mind, 120(480), 1053–1069. http://www.jstor.org/stable/41494776.

  24. Pruss, A. (2012). Aggregating data from agents with the same evidence. Alexander Pruss’s Blog. http://alexanderpruss.blogspot.co.uk/2012/03/aggregating-data-from-agents-with-same.html.

  25. Wagner, C. G. (1982). Allocation, Lehrer models, and the consensus of probabilities. Theory and Decision 14(2), 207–220. doi:10.1007/BF00133978.

  26. Wagner, C. G. (2002). Probability kinematics and commutativity. Philosophy of Science 69(2), 266–278. http://www.jstor.org/stable/10.1086/341053.

  27. Wagner, C. G. (2010a). Jeffrey conditioning and external Bayesianity. Logic Journal of IGPL 18(2), 336–345. http://jigpal.oxfordjournals.org/content/18/2/336.short.

  28. Wagner, C. G. (2010b). Peer disagreement and independence preservation. Erkenntnis 74(2), 277–288. doi:10.1007/s10670-010-9256-9.

  29. Wilson, A. (2010). Disagreement, equal weight and commutativity. Philosophical Studies, 149(3), 321–326. http://www.jstor.org/stable/40783268.

Author information

Correspondence to Jeffrey Sanford Russell.

Appendix: Proof of Fact 4

Let W be a countable set of worlds. In this context, a credence function is given by a function from W to \([0,1]\) that sums to one (i.e., a probability mass function). Let n be the number of individuals. Call a sequence of n credence functions admissible iff there is some world that each function gives positive probability. (This restriction goes with the idea discussed in Sect. 3 that credence one is factive.) Then the aggregation rule \(\hbox{ag}\) is a function that takes each admissible sequence to a single credence function. In this setting, Conditionalization says that for any sequence \(C = \langle C_{1},\,\ldots , C_{n}\rangle\) and set of worlds E, if \(C \mid E = \langle C_{1} \mid E,\,\ldots , C_{n} \mid E\rangle\) is defined and admissible, then \(\hbox {ag}\,(C \mid E) = \hbox {ag}\,C \mid E\). In what follows C and \(C'\) are admissible sequences of credence functions.

For a credence function \(C_{i}\), let \(C_{i}(v : w)\) stand for the ratio \(C_{i}(v)/C_{i}(w)\) (if it is defined).

  • Pointwise Ratios: For any \(v, w \in W\), if \(C_{i}(v : w)\) and \(C'_{i}(v : w)\) are defined and equal for each i, then \(\hbox {ag}\,C(v : w) = \hbox {ag}\,C'(v : w)\).

Lemma 1

Conditionalization is equivalent to Pointwise Ratios.

Proof

Let \(E = \{v, w\}\). For each i, if \(C_{i}(v : w) = C'_{i}(v : w)\) then \(C_{i} \mid E = C'_{i} \mid E\). So by Conditionalization, \(\hbox {ag}\,C \mid E = \hbox {ag}\,C' \mid E\). This implies that \(\hbox {ag}\,C(v : w) = \hbox {ag}\,C'(v : w)\) as well. So Conditionalization implies Pointwise Ratios.

Conversely, let E be any proposition such that \(C \mid E\) is defined and admissible, and let v be a world in E that each individual gives positive probability. Then for each \(w \in E\), \(C\) and \(C \mid E\) both have the same well-defined ratios between w and v. So \(\hbox {ag}\,C\) and \(\hbox {ag}\,(C \mid E)\) also have the same ratios for each \(w \in E\). So \(\hbox {ag}\,C \mid E\) and \(\hbox {ag}\,(C \mid E)\) are proportional, and since each of them adds up to one they are identical. So Pointwise Ratios implies Conditionalization. \(\square\)
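
Lemma 1 can be illustrated on a concrete rule. The following minimal Python sketch assumes a normalized weighted geometric mean as the aggregation rule; the function names `conditionalize` and `geometric_pool`, the four worlds, the credences and the weights are all hypothetical and chosen only for illustration. It checks that, for this rule, pooling and conditionalizing commute and that the group ratio between two worlds survives the update.

```python
import math

def conditionalize(credence, event):
    """Restrict a probability mass function (dict: world -> probability) to `event` and renormalize."""
    total = sum(p for w, p in credence.items() if w in event)
    return {w: (p / total if w in event else 0.0) for w, p in credence.items()}

def geometric_pool(credences, weights):
    """Normalized weighted geometric mean, world by world.

    Python evaluates 0.0 ** 0 as 1.0, which matches the convention 0^0 = 1.
    """
    raw = {
        w: math.prod(c[w] ** a for c, a in zip(credences, weights))
        for w in credences[0]
    }
    total = sum(raw.values())
    return {w: x / total for w, x in raw.items()}

# Hypothetical inputs: two individuals, four worlds, illustrative weights.
C = [
    {"t": 0.4, "u": 0.3, "v": 0.2, "w": 0.1},
    {"t": 0.1, "u": 0.2, "v": 0.3, "w": 0.4},
]
weights = [0.7, 0.3]
E = {"v", "w"}

group = geometric_pool(C, weights)
pool_then_update = conditionalize(group, E)
update_then_pool = geometric_pool([conditionalize(c, E) for c in C], weights)

# Conditionalization: both orders give the same group credences (up to rounding).
same = all(math.isclose(pool_then_update[x], update_then_pool[x]) for x in group)

# Pointwise Ratios: the group ratio v : w is unchanged by the update.
ratio_before = group["v"] / group["w"]
ratio_after = update_then_pool["v"] / update_then_pool["w"]
```

Floating-point arithmetic is used here, so the comparison relies on `math.isclose` rather than exact equality.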

Recall that Neutrality means that \(\hbox {ag}\) commutes with permutations of \(W\), and Continuity means that \(\hbox{ag}\) is a continuous function (with respect to the product topology of \([0, 1]^{W \times n}\)).

Fact 4:

If ag obeys Conditionalization, Continuity, and Neutrality, ag is a Weighted Geometric Rule.

Proof

Pointwise Ratios and Neutrality together imply that there is some function \(F: [0, \infty )^{n} \rightarrow [0, \infty )\) such that for each pair of distinct worlds \(v\) and \(w\), if the ratios \(C_{1}(v :\,w), \ldots , C_{n}(v :\,w)\) are all defined, then

$$\hbox {ag}\,C(v : w) = F(C_{1}(v :\,w),\,\ldots ,\,C_n(v : w))$$

(Pointwise Ratios guarantees that for each v and w there is some function \(F_{v, w}\) that determines the group ratio for \(v\) and \(w\) in terms of the individual ratios, when they are defined. Neutrality guarantees that \(F_{v, w}\) is the same for each \(v\) and \(w\).) Furthermore, if \(\hbox{ag}\) is continuous then \(F\) is continuous as well.

Ratios have the following property: if \(C(u : v)\) and \(C(v : w)\) are both defined, then \(C(u : w) = C(u : v) \cdot C(v : w)\). This implies that F is multiplicative for positive arguments:

$$F(r_{1}\cdot s_{1},\,\ldots ,\,r_{n}\cdot s_{n})=F(r_{1},\,\ldots ,\,r_{n})\cdot F(s_{1},\,\ldots ,\,s_{n})$$

(where each \(r_{i}\) and \(s_{i}\) is positive). It’s helpful to map this onto a logarithmic scale: there is a continuous function \(G : \mathbb{R}^{n} \rightarrow \mathbb{R}\) such that

$$G (\log r_{1}, \ldots , \log r_{n}) = \log F(r_{1}, \ldots , r_{n})$$

(for positive \(r_{i}\)). It follows from F’s multiplicative property that G is additive:

$$G(x_{1}+y_{1},\,\ldots ,\,x_{n}+y_{n})=G(x_{1}, \ldots , x_{n})+G(y_{1}, \ldots , y_{n})$$

But any continuous additive function from \(\mathbb{R}^{n}\) to \(\mathbb{R}\) is linear. (This fact was noted by Cauchy in 1821.) So G is linear, and thus there are weights \(a_{1},\,\ldots ,\,a_{n}\), such that

$$G(x_{1}, \ldots , x_{n})=a_{1}\cdot x_{1}+\cdots +a_{n}\cdot x_{n}$$

Undoing the transformation to the logarithmic scale, then,

$$F(r)=r_{1}^{a_{1}}\cdot \cdots \cdot r_n^{a_{n}}$$

This fixes the value of F for positive ratios to be a weighted geometric mean. When \(r_{i} = 0\), continuity forces F’s value to be the limit as \(r_{i}\) approaches zero. Accordingly, for any i, if \(a_{i} > 0\), then \(F(r_{1}, \ldots , r_{n})\) must be zero when \(r_{i} = 0\), which is consistent with geometric averaging. For \(a_{i} = 0\), if we adopt the convention \(0^{0} = 1\), then again the geometric average extends F continuously to \(r_{i} = 0\). For \(a_{i} < 0\) there is no finite limit at zero, so that case is impossible for continuous F. So F, the rule for group ratios, is a weighted geometric mean with non-negative weights.

This implies that \(\hbox {ag}\) is a Weighted Geometric Rule. Let v be a world with positive individual credences \(p_{1},\,\ldots ,\,p_{n}\), and say the group credence in that world is p. Then for any other world w with individual credences \(q_{1},\,\ldots ,\,q_{n}\) the ratios \(q_{i}/p_{i}\) are defined, and the group ratio is a weighted geometric mean of those ratios. So the group credence in w is

$$p \cdot \left( \frac{q_{1}}{p_{1}}\right) ^{a_{1}} \cdot \cdots \cdot \left( \frac{q_{n}}{p_{n}}\right) ^{a_{n}}$$

which, redistributing parentheses, is just the weighted geometric mean \(q_{1}^{a_{1}}\cdot \cdots \cdot q_{n}^{a_{n}}\) multiplied by a constant normalization factor. \(\square\)
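
The last step of the proof can be traced numerically. The sketch below is a hypothetical Python illustration, not the authors' code: `ratio_rule_F` plays the role of F, `weighted_geometric_rule` is the normalized weighted geometric mean, and the credences and weights are made-up values. It rebuilds every group credence from a reference world v and the individual ratios, and checks that F is multiplicative and is zero when an individual with positive weight has ratio zero.

```python
import math

def ratio_rule_F(ratios, weights):
    """F(r_1, ..., r_n) = r_1^{a_1} * ... * r_n^{a_n}, with non-negative weights."""
    return math.prod(r ** a for r, a in zip(ratios, weights))

def weighted_geometric_rule(credences, weights):
    """World-by-world weighted geometric mean, multiplied by a normalization factor."""
    raw = {w: ratio_rule_F([c[w] for c in credences], weights) for w in credences[0]}
    total = sum(raw.values())
    return {w: x / total for w, x in raw.items()}

# Hypothetical inputs: two individuals, three worlds, illustrative weights.
credences = [
    {"u": 0.5, "v": 0.3, "w": 0.2},
    {"u": 0.1, "v": 0.6, "w": 0.3},
]
weights = [0.5, 0.5]
group = weighted_geometric_rule(credences, weights)

# Rebuild each group credence as p * prod_i (q_i / p_i)^{a_i},
# where p is the group credence in the reference world v.
p = group["v"]
rebuilt = {
    w: p * ratio_rule_F([c[w] / c["v"] for c in credences], weights)
    for w in group
}
matches = all(math.isclose(group[w], rebuilt[w]) for w in group)

# F is multiplicative on positive arguments.
r, s = [2.0, 0.5], [3.0, 4.0]
multiplicative = math.isclose(
    ratio_rule_F([x * y for x, y in zip(r, s)], weights),
    ratio_rule_F(r, weights) * ratio_rule_F(s, weights),
)

# A zero ratio from an individual with positive weight forces a zero group ratio.
zero_case = ratio_rule_F([0.0, 2.0], weights)
```

Equal weights of one half are used only for concreteness; the proof itself establishes only that the weights are non-negative.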

Cite this article

Russell, J.S., Hawthorne, J. & Buchak, L. Groupthink. Philos Stud 172, 1287–1309 (2015). https://doi.org/10.1007/s11098-014-0350-8

Keywords

  • Credence aggregation
  • Formal epistemology
  • Social epistemology
  • Conditionalization
  • Disagreement