Optimal Decision Rules for the Discursive Dilemma

We consider the classical problem in truth-tracking judgment aggregation of a conjunctive agenda with two premisses and one conclusion. We study this problem from the point of view of finding the best decision rule according to a quantitative criterion, under very mild restrictions on the set of admissible rules. The members of the deciding committee are assumed to have a certain probability to assess correctly the truth or falsity of the premisses, and the best rule is the one that minimises a combination of the probabilities of false positives and false negatives on the conclusion.


Statement of the problem
The so-called doctrinal paradox is by now a classical problem in judgment aggregation, in which different reasonable majority-type voting rules may lead to different conclusions: a group of people must assess whether a set of premisses are simultaneously true, and voting first on each premiss or voting directly on the conclusion does not necessarily yield the same result. A slightly different formulation of the paradox has received the name of discursive dilemma.
In practice, the situation may appear when a court is deciding if a defendant is guilty (every item in a set of evidence is verified) or not guilty (at least one item is false), and this is the origin of the name (Kornhauser [20]). But obviously it is present beyond legal cases. A prize or a job position can be awarded if and only if several debatable conditions concur; or several subjective medical indicators determine the presence of an illness or the need for a treatment; and so on. As soon as three people join to make a decision on a compound question, the paradox is potentially present.
To be precise, suppose there are two clauses P and Q, that each member of a committee has to decide between P and its negation ¬P and between Q and its negation ¬Q, and that the final goal is to assess whether C := P ∧ Q or its negation ¬C is true. In the court example, the jury has to decide if the defendant is guilty or not, and it is agreed beforehand that the guilty verdict is logically equivalent to the truth of both premisses P and Q.
Suppose the voters first decide by simple majority between P and ¬P, and separately between Q and ¬Q. If both P and Q get the majority, then the conclusion is C, and otherwise it is ¬C. This decision rule is called premiss-based. Suppose on the other hand that each voter decides directly on C or ¬C, and then the collective decision is taken by simple majority on these alternatives. This rule is called conclusion-based.
There are cases where the premiss-based rule leads to C, while the conclusion-based rule yields ¬C. For instance, for a 3-member committee, this happens when one of the members thinks both P and Q are true, the second one thinks P is true and Q is false, and the last one thinks P is false and Q is true. Sometimes it is said that the conclusion-based rule is more "conservative" than the premiss-based rule (or that the latter is more "liberal" than the former), because the positive conclusion C is frequently the "risky one". These two rules are quite natural, and both can be justified on intuitive or philosophical grounds; see, for example, Mongin [32, section 2]. In particular, the conclusion-based rule respects the deliberation of the individual judges, whereas in the premiss-based rule the decision can be fully justified in legal terms. Other rules can be proposed. In [1], we introduced a new rule, which stands in some sense midway between the premiss-based and the conclusion-based rules.
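The three-member example can be checked mechanically. The following is a minimal sketch, assuming each individual judgment is encoded as a pair of booleans (judgment on P, judgment on Q); the function names are illustrative and are not taken from the paper or its software.

```python
def premiss_based(votes):
    """Majority vote on P and on Q separately; conclude C only if both pass."""
    n = len(votes)
    p_majority = sum(p for p, _ in votes) > n / 2
    q_majority = sum(q for _, q in votes) > n / 2
    return p_majority and q_majority


def conclusion_based(votes):
    """Each voter derives C = P and Q individually; majority vote on C."""
    n = len(votes)
    return sum(p and q for p, q in votes) > n / 2


# The profile described in the text: (P, Q) judgments of the three members.
profile = [(True, True), (True, False), (False, True)]

print(premiss_based(profile))     # True:  P and Q each get 2 votes out of 3
print(conclusion_based(profile))  # False: only 1 member out of 3 supports C
```

The two rules disagree on this profile, which is exactly the situation described above.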
In this paper we study general decision rules for the situation given. We will consider the set of all possible decision rules, subject only to a very mild rationality requirement. They will be called admissible rules. We want to study this set as a whole, and find the best rule according to some objective criterion, regardless of whether that rule can be explained on "logical" or "intuitive" grounds, is a consequence of some political or sociological idea, or satisfies some other desirable property.
The mentioned rationality requirement states only that if a member of the committee changes their opinion on a clause in some direction, the conclusion can change, if at all, only in the same direction.
We take the epistemic point of view that there is an actual truth that we want to guess with the highest possible confidence. This is different from the aggregation of preferences as in elections, or in taking decisions on the course of actions, where there is not an absolute truth.
Our objective criterion is the minimisation of the combined chances of incurring false positives (deciding C when the reality is ¬C) and false negatives (deciding ¬C when the reality is C). This is explained in detail in Section 3 and of course involves a mathematical (probabilistic) setting where all elements have to be precisely defined. Our point of view is thus "conclusion-centric", in the sense that we do not care about the correct guessing of the premisses.
We emphasise that the adoption of a particular criterion is a modelling choice, and it is what confers the rationale to the best rule under it. The criterion proposed here can be replaced by another one, if deemed better for the situation at hand, and the philosophy of finding the best under the chosen criterion can be applied as well. We consider here, in fact, a family of criteria, parametrised by the relative weight put on false positives and false negatives.
With this optimisation approach, we do not need to talk about majorities. The votes of the n members of the committee will be split into four slots: P ∧ Q, P ∧ ¬Q, ¬P ∧ Q, and ¬P ∧ ¬Q, and the aggregated numbers of votes for each possibility will be non-negative integers x, y, z, t, respectively, with x + y + z + t = n. In the case of three premisses there will be 8 slots, and in general p premisses give 2^p different possible votes for each member. A decision rule states, for each possible value of (x, y, z, t), which decision, C or ¬C, is taken.
The number of rules grows exponentially with n. The admissible rules are far fewer, and they can be implicitly enumerated so that all computations needed to find the optimal rule or a ranking of rules are relatively efficient.
If each committee member could infallibly guess the truth or falsity of each premiss, then the correct truth or falsity of the conclusion would be reached without difficulty. In fact, a single-member committee would suffice. The whole point of having multi-member committees is to reduce the possibility that the final conclusion be wrong. It is therefore quite natural to use a probabilistic model that starts with the (estimated) probability that the committee members guess each premiss correctly. We call this probability their competence, and we assume that it is greater than 1/2, and that it is the same for all members of the committee and for all premisses, although this is easily relaxed, as we will see in the final section.
The collective decision guesses correctly or incorrectly with some probability that depends on the voters' competence and on the real truth value of the premisses. Only the conclusion matters, and only the premisses are voted on. One may think, as pointed out by Mongin [32], that an external judge has to decide on the conclusions after the committee members have sent in their individual opinions.

Related literature
Doctrinal paradox. The term doctrinal paradox first appears in the works of Kornhauser [21], and Kornhauser and Sager [22]. They were interested in legal court cases, so that they spoke of issue-by-issue and case-by-case majority voting.
Pettit [38] and List and Pettit [26] formulated the problem in terms of propositional logic, and called it the discursive dilemma. The simple example of the three-member committee cited above can be summarised in Table 1.

            P       Q       C
Member 1    True    True    True
Member 2    True    False   False
Member 3    False   True    False
Majority    True    True    False

Table 1: The discursive dilemma: The collective majority voting in the three premisses is inconsistent with the doctrine C ⇔ P ∧ Q.

In the Kornhauser-Sager formulation, the committee votes either on the first two propositions (premiss-based/issue-by-issue), or on the third (conclusion-based/case-by-case), and the two results are different. In the List-Pettit formulation, the committee votes on the three propositions, and this leads to a logical inconsistency. The inconsistency comes from the constraint C ⇔ P ∧ Q (the "doctrine" to which there is a previous agreement). We see that the individual members of the committee adhere to the doctrine; however, the committee as a whole does not.
The advantage of the formulation in terms of propositional logic is that it can be generalised to any set of propositions, to the point that the distinction between premisses and conclusions may be unnecessary. In general, an agenda is a logically consistent set of propositions, closed under negation, on which judgments have to be made, and that can be entangled by logical constraints. In our case the agenda is {P, ¬P, Q, ¬Q, C, ¬C}, with the constraint C ⇔ P ∧ Q.
In this setting, the doctrinal paradox (or more properly, the discursive dilemma) reads: If all pairs of formulae in the agenda are decided by majority, the resulting set of propositions can be inconsistent.
Judgment aggregation. The body of knowledge that has been developed from the List-Pettit formulation is known as Judgment Aggregation Theory (or Logical Aggregation Theory, as proposed by Mongin [32]). In a quite natural way, the backbone of the theory is formed by (im)possibility results on the existence of aggregation rules satisfying certain desirable axioms. List and Pettit [26], [27] already proved results of this kind, extended very soon by Pauly and van Hees [36], Dietrich [8], and Nehring and Puppe [34].
The aggregation problem is described in full generality, for example, in the preliminaries of Nehring and Pivato [33] and Lang et al. [23], and in the complete surveys by Mongin [32], List and Puppe [29], List and Polak [28], and List [25]. A judgment is defined as a mapping from the agenda to the doubleton {True, False}; a feasible judgment moreover respects the underlying logical constraints of the propositions. The judgment aggregation problem is then defined as the construction of a feasible reasonable collective judgment from the voters' individual judgments. Formally, an aggregation rule F is a mapping that assigns to every profile (J_1, ..., J_n) of individual judgments J_i of the n voters a collective judgment J = F(J_1, ..., J_n). A feasible aggregation rule must assign a feasible judgment to any profile of feasible individual judgments. Feasible rules trivially exist; for instance J ≡ J_i for some i is such a rule (called a dictatorship, for obvious reasons). Banning dictatorships and imposing other mild desirable conditions leads very quickly to non-existence of feasible aggregation rules. The range of possible voting paradoxes is the set of non-feasible mappings. The classical doctrinal paradox case, despite its simplicity, already features one such non-feasible mapping, namely the one that assigns True to P and to Q, and False to C.

In a truth-functional agenda, the propositions are split into a set of premisses and a set of conclusions. Assigning a Boolean value to all premisses, and applying the logical constraints, the value of all conclusions is determined. This is clearly the case in the doctrinal paradox setting, where moreover the premisses consist of mutually independent (not linked by constraints) proposition-negation pairs. The truth-functional case in general has been studied mainly in Nehring and Puppe [34], for independent as well as interdependent premisses, and in Dokow and Holzman [13] (see also Miller and Osherson [30]).
Distance-based methods and truth-tracking. From 2006 (Pigozzi [39], Dietrich and List [9]) another point of view emerged, in which specific judgment rules are proposed, and their properties studied. See Lang et al. [23] for a partial survey, and the references therein. Most of these rules can be defined as some sort of optimisation with respect to a criterion, i.e. the rule is defined as the one(s) that maximises or minimises a certain quantity, usually a distance or pseudo-distance to the individual profiles, while providing a consistent consensus judgment set.
There are two different situations to which they can be applied.Either the collective judgment set is a decision on the course of actions (as in the adoption of public policies), or there is an underlying objective truth of each proposition under scrutiny that one would like to guess (as in court cases).The latter is called truth-tracking (or epistemic) judgment aggregation, and it is where the present work belongs.
Thus the goal is to get the right values of this pre-existent "state of nature", or at least the right values of the set of conclusions in the truth-functional case. In this context, the concept of competence of the voters arises naturally: how likely is it that a voter guesses the correct answer to an issue? And it is also natural to model this likelihood as a probability. Actually, this approach dates back to Condorcet and his celebrated Jury Theorem.
The competence as a parameter has been studied, for example, in Bovens and Rabinowicz [4] and in Grofman et al. [17] for the one-issue case. The latter extends the Condorcet theorem in several directions, particularly for the case of unequal competences among voters.
List [24] computed the probability of appearance of the doctrinal paradox, and the probability of correct truth-tracking as a function of the different states of nature, allowing for different competences in judging both premisses but the same across individuals. Fallis [14] also observed that the premiss-based rule may or may not be better than the conclusion-based rule, depending on the competence and on the "scenario" (state of nature).
The fact that the probability of guessing the truth correctly depends on the unknown true state of nature leads to a modelling choice: either we specify an a priori probability distribution on the possible states of nature (for the doctrinal paradox, the four states P ∧ Q, P ∧ ¬Q, ¬P ∧ Q and ¬P ∧ ¬Q), or we have to resort to conservative estimations, as in classical (non-Bayesian) statistics. The main approach in this paper is the second, but most of the related literature assumes the first. Notably, the cited paper by Bovens and Rabinowicz [4] compares the premiss-based and the conclusion-based rules under the assumption of same competence for both premisses and their negations, and independence of voters, as in the present paper. They impose a Bernoulli prior on each premiss, the same for all.
Hartmann et al. [19] aim at generalising [24] and [4] with a conjunctive truth-functional agenda allowing more than two premisses. The authors propose a continuum of distance-based rules, parametrised by the weight of the conclusion relative to the premisses, and containing the premiss- and conclusion-based procedures as extreme cases. The hypotheses are essentially the same as in [4]. Miller and Osherson [30] also propose a variety of distance metrics, and distinguish between "underlying metric" and "solution method". Each solution method chooses a loss function to minimise (based on the metric) and a set of eligible rules.
The point of view of Pivato [41] is that the votes are observations of the 'truth plus noise'. This allows one to think of the profile of individual judgments as a statistical sample (at least under the hypothesis of the same noise distribution for all voters), and to study the decision rules as statistical estimators.
Combined with a truth-functional agenda, the truth-tracking setting can still be concerned either with guessing the truth of all propositions ('getting the right answer for the right reasons') or only with the conclusion ('getting the right answer for whatever reasons'). Bovens and Rabinowicz [4] discuss the merits of premiss- and conclusion-based procedures for both goals. Bozbay et al. [6] and Bozbay [5] also study both aims, for independent and for interrelated issues, respectively. The cited work by Hartmann et al. [19] is conclusion-centric ("whatever reasons"), while Pigozzi et al. [40], being conclusion-centric, later apply a procedure based on Bayesian networks to get the premisses that "interpret" the previously decided conclusions.
Distance-based methods are nothing else than the minimisation of a loss function that measures the dissatisfaction with every possible consistent outcome. Equivalently, one may maximise utility functions. Both are capable of accounting for the consequences of the decisions, and thus allow one to set up more complete models, in line with Statistical Decision Theory (Berger [2]). Different loss functions or utilities give rise to possibly different optimal rules, and it is a modelling task to choose the right loss function for the problem at hand.
In this sense, Fallis [14] writes about the 'epistemic value', highlighting that guessing a proposition correctly may have a different value than guessing its contrary correctly; Bozbay [5] uses a simple 0-1 utility function to indicate incorrect-correct guessing (of all propositions or of the conclusions alone); Hartmann et al. [19] try giving different utilities to false positives and false negatives on the conclusion to assess the performance of their continuum of metrics; finally, Bovens and Rabinowicz [4], in the discussion section, suggest introducing different utilities for each correct guess to compare the premiss-based and the conclusion-based voting rules in each practical case.
Our proposal in this paper is an optimisation criterion in the truth-tracking, conclusion-centric case of the discursive dilemma.The best decision rule minimises a combination of false positives and false negatives, and any two rules can be easily compared according to this criterion.No prior on the states of nature needs to be established, although it can be easily accommodated.
In the theoretical results we assume the same competence level for all committee members and all premisses, and independence among premisses. In practice, the specific computation of the score of each rule can be done under far fewer assumptions. In any case, the loss function fully determines the optimal rule.
We do not consider strategic voting; we assume that everyone votes honestly on each of the premisses. Strategic voting is conceivable even in the simple doctrinal paradox case: someone who honestly would vote for P and ¬Q could change to ¬P and ¬Q just to push more for the ¬C conclusion. Strategic voting is considered in Bozbay et al. [6], de Clippel and Eliaz [7], Terzopoulou and Endriss [43] and Bozbay [5].

Organisation of the paper
The remainder of the paper is organised as follows: In Section 2 we study the structure of the sets of voting tables and of decision rules, which are both partially ordered sets with an order induced by the admissibility condition.
Section 3 introduces in the first part our probabilistic model, based on the probability that each committee member guesses correctly the truth value of each premiss, and the concepts of false positive and false negative when the true state of nature is unknown.In the second part, we introduce the family of optimisation criteria, parametrised by a relative weight assigned to the two errors.
Section 4 contains our main theoretical results. It turns out that it is relatively easy to determine whether a given voting table leads to the conclusion C or ¬C in the optimal rule. Moreover, this depends only on two numbers: the difference between votes to P ∧ Q and to ¬P ∧ ¬Q, and the difference between votes to P ∧ ¬Q and to ¬P ∧ Q. This simplifies the structure of the set of voting tables, and shortens the evaluation of the rules. In this section we also characterise completely the set of values of competence and weight for which the premiss-based rule is optimal.
Section 5 explains details on the actual computations, and describes the accompanying software, downloadable from https://discursive-dilemma.sourceforge.io. Finally, in Section 6 we discuss the results and propose possible extensions. Some marginal computations and checks have been left to an appendix.

The set of decision rules
A possible voting result of a committee with n members assessing issues P and Q will be a 2 × 2 table, denoted

x  y
z  t

or (x, y, z, t) to save space, with non-negative integer entries adding up to n, representing the number of votes received by the options P ∧ Q, P ∧ ¬Q, ¬P ∧ Q and ¬P ∧ ¬Q, respectively. The set of all such tables will be denoted by T.
A decision rule can be thought of as a mapping T → {0, 1}. Tables mapped to 1 are those that entail the decision C = P ∧ Q; those mapped to 0 represent the opposite, ¬C. Sometimes we will call them positive tables and null tables, and denote by T^+ and T^0 the respective sets.
The decision rule can also be seen as the subset of positive tables, and we will make use of both interpretations.
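There are N = C(n + 3, 3) = (n + 3)(n + 2)(n + 1)/6 ways to fill a voting table (a given table (x, y, z, t) arising from n!/(x! y! z! t!) different assignments of the individual votes), and 2^N decision rules, as many as subsets of the set of tables. This is a huge number, already for n = 3, but the set of "reasonable rules" will be much more modest.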
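The size of the set of tables can be checked by direct enumeration. The sketch below assumes tables are encoded as tuples (x, y, z, t) and compares the count with the standard stars-and-bars value C(n + 3, 3); the helper names `tables` and `count_tables` are illustrative only.

```python
from math import comb


def tables(n):
    """Yield every voting table (x, y, z, t) of non-negative integers adding up to n."""
    for x in range(n + 1):
        for y in range(n - x + 1):
            for z in range(n - x - y + 1):
                yield (x, y, z, n - x - y - z)


def count_tables(n):
    """Number of distinct voting tables for a committee of size n."""
    return sum(1 for _ in tables(n))


for n in (3, 5, 7):
    # The enumeration agrees with the binomial coefficient C(n + 3, 3).
    assert count_tables(n) == comb(n + 3, 3)
    print(n, count_tables(n))

# Number of (unrestricted) decision rules for n = 3: one bit per table.
print(2 ** count_tables(3))  # 1048576
```

Already for n = 3 there are 20 tables and therefore 2^20 unrestricted rules, which illustrates why a restriction to admissible rules is needed before any exhaustive search.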
Two tables (x, y, z, t) and (x, z, y, t) are said to be transposed of each other. Since P and Q play symmetric roles, it makes sense to admit only rules that assign the same decision to both tables.
Besides this symmetry, we impose another condition for the admissibility of a decision rule. Suppose that, given a positive voting table (x, y, z, t), one of the voters of ¬P changes their choice to P, or one of the voters of ¬Q changes to Q. The new table supports the conclusion C better than the old one; hence it makes sense to impose that the new table be also a positive table. To formulate the condition in a mathematically practical way, let us introduce the partial order on T generated by the four relations

(x, y, z, t) ≤ (x + 1, y − 1, z, t),
(x, y, z, t) ≤ (x + 1, y, z − 1, t),
(x, y, z, t) ≤ (x, y + 1, z, t − 1),
(x, y, z, t) ≤ (x, y, z + 1, t − 1),        (1)

that is, the smallest partial order ≤ that satisfies relations (1) above, for all x, y, z, t for which they make sense. A partially ordered set is also called a poset, for short. The relations (1) are called the transitive reduction of the poset (T, ≤). Posets can be represented by Hasse diagrams, which are directed graphs with the transitive reduction represented by arrows pointing in the increasing direction. The case of committee size n = 3 is depicted in Figure 1, where transposed tables have been identified; they occupy the same spot and are not comparable. When two tables T, S ∈ T satisfy T ≤ S and T ≠ S, we shall obviously write T < S, or S > T.
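The generating moves of the order can be written down explicitly: a voter of ¬P switching to P moves one vote from z to x or from t to y, and a voter of ¬Q switching to Q moves one vote from y to x or from t to z. The sketch below, with illustrative function names, lists the immediate successors of a table and tests comparability by searching along those moves.

```python
def successors(table):
    """Immediate successors of a table: one voter moves from ¬P to P or from ¬Q to Q."""
    x, y, z, t = table
    moves = []
    if y > 0:
        moves.append((x + 1, y - 1, z, t))  # a P∧¬Q voter switches to P∧Q
    if z > 0:
        moves.append((x + 1, y, z - 1, t))  # a ¬P∧Q voter switches to P∧Q
    if t > 0:
        moves.append((x, y + 1, z, t - 1))  # a ¬P∧¬Q voter switches to P∧¬Q
        moves.append((x, y, z + 1, t - 1))  # a ¬P∧¬Q voter switches to ¬P∧Q
    return moves


def leq(lower, upper):
    """True if lower ≤ upper, i.e. upper is reachable from lower through successive moves."""
    seen, frontier = {lower}, [lower]
    while frontier:
        current = frontier.pop()
        if current == upper:
            return True
        for nxt in successors(current):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return False


print(leq((0, 0, 0, 3), (3, 0, 0, 0)))  # True: the minimal table lies below the maximal one
print(leq((1, 2, 0, 0), (1, 0, 2, 0)))  # False: these transposed tables are not comparable
```

The search is exhaustive and only meant for small committees; it is a way to inspect the order, not an efficient comparison procedure.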
We thus arrive at the following reasonable definition of admissibility:

Definition 2.1. A decision rule r : T → {0, 1} is admissible if:
1. It takes the same value on transposed tables: r(x, y, z, t) = r(x, z, y, t).
2. It is order-preserving on the partially ordered set (T, ≤): T ≤ S ⇒ r(T) ≤ r(S).

Example 2.2. The classical premiss-based rule r_pb is defined by
r_pb(x, y, z, t) = 1 if and only if x + y > z + t and x + z > y + t,
whereas the conclusion-based rule r_cb is given by
r_cb(x, y, z, t) = 1 if and only if x > y + z + t.
It is readily checked that both rules are admissible in the sense of Definition 2.1. In Alabert and Farré [1], another admissible rule was introduced, called path-based, and defined by
r_hb(x, y, z, t) = 1 if and only if x > z + t and x > y + t.
As an example of a non-admissible rule, consider declaring C true if and only if the votes for P ∧ Q are more than any other combination of premisses (i.e. x > y and x > z and x > t). We will see later that the second admissibility condition is not restrictive with respect to our optimisation criterion: given a non-order-preserving rule, there exists an order-preserving one that performs better.
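Admissibility can be tested by brute force for small committees. The sketch below assumes the conclusion-based rule is simple majority on the P ∧ Q slot (x > y + z + t) and checks order preservation only on immediate successors, which suffices by transitivity; the names `is_admissible` and `r_plurality` are illustrative.

```python
def tables(n):
    """Yield every voting table (x, y, z, t) adding up to n."""
    for x in range(n + 1):
        for y in range(n - x + 1):
            for z in range(n - x - y + 1):
                yield (x, y, z, n - x - y - z)


def successors(table):
    """Tables reached when one voter switches from ¬P to P or from ¬Q to Q."""
    x, y, z, t = table
    moves = []
    if y > 0:
        moves.append((x + 1, y - 1, z, t))
    if z > 0:
        moves.append((x + 1, y, z - 1, t))
    if t > 0:
        moves.append((x, y + 1, z, t - 1))
        moves.append((x, y, z + 1, t - 1))
    return moves


def r_pb(x, y, z, t):  # premiss-based
    return int(x + y > z + t and x + z > y + t)

def r_cb(x, y, z, t):  # conclusion-based (assumed: majority on the P∧Q slot)
    return int(x > y + z + t)

def r_hb(x, y, z, t):  # path-based
    return int(x > z + t and x > y + t)

def r_plurality(x, y, z, t):  # the non-admissible example of the text
    return int(x > y and x > z and x > t)


def is_admissible(rule, n):
    """Check symmetry under transposition and order preservation for committee size n."""
    for (x, y, z, t) in tables(n):
        if rule(x, y, z, t) != rule(x, z, y, t):
            return False
        if any(rule(x, y, z, t) > rule(*s) for s in successors((x, y, z, t))):
            return False
    return True


for rule in (r_pb, r_cb, r_hb, r_plurality):
    print(rule.__name__, is_admissible(rule, 5))
```

For n = 5 the plurality-type rule fails because the table (2, 1, 1, 1) is positive while its successor (2, 2, 1, 0) is not.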
The poset (T, ≤) is ranked (also called graded), i.e. there exists a rank function ρ compatible with the order relation: it satisfies T < S ⇒ ρ(T) < ρ(S), and if S is an immediate successor of T (there are no elements in between), then ρ(S) = ρ(T) + 1. In the Hasse diagram, each rank can be pictured as a "level" in the graph (see again Figure 1).
To prove that (T, ≤) is a ranked poset, we use the known result that a finite poset admits a rank function if and only if all maximal chains have the same length. Recall that a chain is a totally ordered subset of the poset. A maximal chain is a chain with maximal cardinality.
A poset is connected if for every two elements T and S there is a finite sequence T = U_1, ..., U_n = S of elements such that U_i and U_{i+1} are comparable, i.e. either U_i ≤ U_{i+1} or U_{i+1} ≤ U_i. The poset (T, ≤) is connected, because we can transform a table into any other one by moving votes one at a time through the transitive reduction.
Proof. There is a unique minimal element, namely m = (0, 0, 0, n), and a unique maximal element M = (n, 0, 0, 0). Since (T, ≤) is connected, all maximal chains start and finish in these elements. To go from m to M, each vote must make two steps, one of them up or to the left of the table (that means, from t to y or z), and the other one to the left or up respectively (from y or z to x). These individual movements are the transitive reduction of the partial order ≤, and therefore there are no other tables in between. Since there are n votes, we need 2n steps to move all votes from the minimal to the maximal element, and in consequence any maximal chain has exactly 2n + 1 elements.
Notice that a movement towards an immediate successor implies subtracting one unit from t or adding one unit to x, but not both. Therefore ρ(T) = x − t is a rank function for (T, ≤).
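The rank function can be verified numerically for a given committee size. The following sketch, with illustrative helper names, checks that every immediate successor raises x − t by exactly one and builds one maximal chain from (0, 0, 0, n) to (n, 0, 0, 0) to confirm that it has 2n + 1 elements.

```python
def tables(n):
    """Yield every voting table (x, y, z, t) adding up to n."""
    for x in range(n + 1):
        for y in range(n - x + 1):
            for z in range(n - x - y + 1):
                yield (x, y, z, n - x - y - z)


def successors(table):
    """Immediate successors of a table in the partial order."""
    x, y, z, t = table
    moves = []
    if y > 0:
        moves.append((x + 1, y - 1, z, t))
    if z > 0:
        moves.append((x + 1, y, z - 1, t))
    if t > 0:
        moves.append((x, y + 1, z, t - 1))
        moves.append((x, y, z + 1, t - 1))
    return moves


def rank(table):
    """Rank function rho(T) = x - t."""
    x, _, _, t = table
    return x - t


def maximal_chain(n):
    """One maximal chain, built by always following the first available successor."""
    chain = [(0, 0, 0, n)]
    while chain[-1] != (n, 0, 0, 0):
        chain.append(successors(chain[-1])[0])
    return chain


n = 5
# Every covering move increases the rank by exactly one unit.
assert all(rank(s) == rank(t) + 1 for t in tables(n) for s in successors(t))
print(len(maximal_chain(n)))  # 11, that is, 2n + 1
```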
The rank function of a ranked poset is not unique, but it is completely determined by setting the rank of any element of the poset.
The set A of admissible decision rules possesses also a natural partial order: r ≤ s if r(T ) ≤ s(T ) for all tables T ∈ T. This is the usual partial order in a set of real functions on any domain.
Since the range of decision rule mappings is {0, 1}, the relation r ≤ s means that the set of positive tables relative to r is included in the set of positive tables relative to s. It can be said that r is less liberal (or more conservative) than s in the sense explained in the Introduction. In terms of the risk of opting for C when it is wrong, ≤ is the relation "to be less risky or equal to".
Example 2.4. Refer to the rules of Example 2.2. The premiss-based rule is more liberal than the path-based rule, and this one is in turn more liberal than the conclusion-based rule. In other words, we are saying that, in (A, ≤), r_cb ≤ r_hb ≤ r_pb. This can be seen using the characterisations given in Example 2.2 (see [1, Proposition A.2] for the proof). To be a little more precise, one can check that rules r_hb and r_cb coincide for committee sizes n = 3, 5, and they are different for n ≥ 7. Rule r_pb is always strictly greater than r_hb.
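The comparisons in this example can be reproduced by enumerating all tables for small odd n. The sketch assumes the conclusion-based rule is x > y + z + t; the helpers `rule_leq` and `rule_equal` are illustrative names for the pointwise comparison of rules.

```python
def tables(n):
    """Yield every voting table (x, y, z, t) adding up to n."""
    for x in range(n + 1):
        for y in range(n - x + 1):
            for z in range(n - x - y + 1):
                yield (x, y, z, n - x - y - z)


def r_pb(x, y, z, t):  # premiss-based
    return int(x + y > z + t and x + z > y + t)

def r_cb(x, y, z, t):  # conclusion-based (assumed: majority on the P∧Q slot)
    return int(x > y + z + t)

def r_hb(x, y, z, t):  # path-based
    return int(x > z + t and x > y + t)


def rule_leq(r, s, n):
    """r ≤ s: every positive table of r is also a positive table of s."""
    return all(r(*T) <= s(*T) for T in tables(n))


def rule_equal(r, s, n):
    """The two rules take the same decision on every table."""
    return all(r(*T) == s(*T) for T in tables(n))


for n in (3, 5, 7, 9):
    ordered = rule_leq(r_cb, r_hb, n) and rule_leq(r_hb, r_pb, n)
    print(n, ordered, rule_equal(r_cb, r_hb, n), rule_equal(r_hb, r_pb, n))
```

For n = 7 the table (3, 2, 2, 0) separates the path-based rule (positive) from the conclusion-based rule (null).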
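An upper set is a subset U of a poset such that x ∈ U, x < y ⇒ y ∈ U. It is immediate to see, from the second condition of admissibility, the following equivalence: a decision rule is order-preserving if and only if its set of positive tables is an upper set of (T, ≤).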
Given a poset, the family of its upper sets, with the inclusion order relation, is a complete lattice: A partially ordered set in which all subsets have a supremum (a least upper bound) and an infimum (a greatest lower bound).Applied to our case, we are saying that the union and the intersection of upper sets are upper sets or, in terms of rule mappings, that the maximum (= sum) and the minimum (= product) of admissible rules are admissible rules.
An antichain is a subset S of a poset such that any two elements of S are not comparable. Antichains and upper sets are related in the following way: the minimal elements of any upper set form an antichain; conversely, any antichain A determines the upper set

U = {T : S ≤ T for some S ∈ A}.

The empty antichain is also considered, and corresponds in our case to the rule r ≡ 0.
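The correspondence between antichains and upper sets can be illustrated on the premiss-based rule for n = 3. The sketch below (illustrative names, brute-force search) extracts the minimal positive tables and then rebuilds the whole set of positive tables from them.

```python
def tables(n):
    """Yield every voting table (x, y, z, t) adding up to n."""
    for x in range(n + 1):
        for y in range(n - x + 1):
            for z in range(n - x - y + 1):
                yield (x, y, z, n - x - y - z)


def successors(table):
    """Immediate successors in the partial order."""
    x, y, z, t = table
    moves = []
    if y > 0:
        moves.append((x + 1, y - 1, z, t))
    if z > 0:
        moves.append((x + 1, y, z - 1, t))
    if t > 0:
        moves.append((x, y + 1, z, t - 1))
        moves.append((x, y, z + 1, t - 1))
    return moves


def predecessors(table):
    """Immediate predecessors: the reverse of the four generating moves."""
    x, y, z, t = table
    moves = []
    if x > 0:
        moves.append((x - 1, y + 1, z, t))
        moves.append((x - 1, y, z + 1, t))
    if y > 0:
        moves.append((x, y - 1, z, t + 1))
    if z > 0:
        moves.append((x, y, z - 1, t + 1))
    return moves


def r_pb(x, y, z, t):  # premiss-based rule
    return int(x + y > z + t and x + z > y + t)


def minimal_elements(upper):
    """In an upper set, a table is minimal iff none of its immediate predecessors belongs to it."""
    return {T for T in upper if not any(p in upper for p in predecessors(T))}


def upper_closure(antichain):
    """Upper set generated by an antichain: everything reachable through successors."""
    closure, frontier = set(antichain), list(antichain)
    while frontier:
        for s in successors(frontier.pop()):
            if s not in closure:
                closure.add(s)
                frontier.append(s)
    return closure


positive = {T for T in tables(3) if r_pb(*T)}
antichain = minimal_elements(positive)
print(sorted(antichain))                     # [(1, 1, 1, 0), (2, 0, 0, 1)]
print(upper_closure(antichain) == positive)  # True
```

Storing a rule through its antichain of minimal positive tables is one compact representation; whether the paper's software uses it is not stated in this section.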
For finite posets, the correspondence between antichains and upper sets is bijective.Enumerating upper sets is therefore equivalent to enumerating antichains.Even computing the number of upper sets is not easy in general.For example, in the well-known poset of the subsets of a given set of k elements, with the inclusion relation, the number of upper sets (called the Dedekind numbers, see [35]), is not known for k > 8.

Probabilistic model and optimisation criterion
We want to find the best of all admissible rules, according to some quantitative criterion, formulated in terms of a probabilistic model.In this section the criterion will be introduced, and the next one will be devoted to the characterisation of the optimal rule.

Probabilistic model
Suppose C = P ∧ Q is the true state of nature. If for some rule r and a table of votes T = (x, y, z, t), we have r(T) = 1, we say that this is a true positive (TP). Otherwise, if r(T) = 0, it is a false negative (FN). Similarly, if ¬C is the true state, r(T) = 0 will be a true negative (TN) and r(T) = 1 will be a false positive (FP). Ideally, a good decision rule should minimise somehow the occurrence of false positives and false negatives. To assess the likelihood of these occurrences we need a probabilistic model in which to evaluate the probability of appearance of FP and FN. To that end, we need an estimate of the probability that the members of the committee guess correctly the true value of the premisses P and Q.
The probability that a committee member votes the correct value of a premiss will be called their competence. We will assume that all committee members have the same competence, a number strictly between 1/2 and 1. Notice that a competence less than 1/2 does not make sense, because in that case we can reverse all opinions of the committee, and we get another committee with competence greater than 1/2. If it were exactly 1/2, there is a trivial solution that will be pointed out later; if it is 1, then a one-member committee is enough and they are always right. We will also assume that the committee size n is odd, together with two additional independence conditions. Specifically, we assume in the sequel the following hypotheses:
(H1) Odd committee size: The number of voters is an odd number, n = 2m + 1, with m ≥ 1.
(H2) Equal competence: The competence θ satisfies 1/2 < θ < 1 and it is the same for all voters and for both premisses P and Q.
(H3) Mutual independence among voters: The decision of each voter does not depend on the decisions of the other voters.
(H4) Independence between P and Q: For each voter, the decision on one premiss does not influence the decision on the other.
Formally, hypotheses (H2)-(H4) can be rephrased by saying that for each voter in the committee and each premiss, there is a random variable that takes the value 1 if the voter believes the clause is true, and zero otherwise, and all these random variables are stochastically independent and identically distributed. Their specific distribution depends on the true state of nature. See Section 6 for possible relaxations of these hypotheses.
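Proposition 3.1. Under the hypotheses above, the probability that the votes result in a particular table T = (x, y, z, t) is, under the different states of nature,

P_{P∧Q}(T) = n!/(x! y! z! t!) · θ^(2x+y+z) (1 − θ)^(y+z+2t),        (2)
P_{P∧¬Q}(T) = n!/(x! y! z! t!) · θ^(x+2y+t) (1 − θ)^(x+2z+t),        (3)
P_{¬P∧Q}(T) = n!/(x! y! z! t!) · θ^(x+2z+t) (1 − θ)^(x+2y+t),        (4)
P_{¬P∧¬Q}(T) = n!/(x! y! z! t!) · θ^(y+z+2t) (1 − θ)^(2x+y+z),        (5)

where ! means the factorial of a number.

Proof. See Proposition A.4 of [1].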
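Under (H2)-(H4) each voter falls into one of the four slots independently, with slot probabilities obtained as products of θ and 1 − θ, so the table follows a multinomial distribution. The sketch below encodes the state of nature as a pair of booleans (P true, Q true), which is an illustrative convention, and checks that the probabilities add up to 1.

```python
from math import factorial


def tables(n):
    """Yield every voting table (x, y, z, t) adding up to n."""
    for x in range(n + 1):
        for y in range(n - x + 1):
            for z in range(n - x - y + 1):
                yield (x, y, z, n - x - y - z)


def slot_probabilities(theta, state):
    """Probability that one voter lands in P∧Q, P∧¬Q, ¬P∧Q, ¬P∧¬Q under the given state."""
    p_true, q_true = state
    says_p = theta if p_true else 1 - theta  # probability that the voter asserts P
    says_q = theta if q_true else 1 - theta  # probability that the voter asserts Q
    return (says_p * says_q,
            says_p * (1 - says_q),
            (1 - says_p) * says_q,
            (1 - says_p) * (1 - says_q))


def table_probability(table, theta, state):
    """Multinomial probability of observing the table under the given state of nature."""
    x, y, z, t = table
    n = x + y + z + t
    coefficient = factorial(n) // (factorial(x) * factorial(y) * factorial(z) * factorial(t))
    a, b, c, d = slot_probabilities(theta, state)
    return coefficient * a**x * b**y * c**z * d**t


theta, n = 0.75, 3
for state in [(True, True), (True, False), (False, True), (False, False)]:
    total = sum(table_probability(T, theta, state) for T in tables(n))
    print(state, round(total, 12))  # each total is 1.0 up to rounding
```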
Let us denote by P_{P∧Q} the probabilities computed under the state of nature P ∧ Q. According to the proposition above, the probability of obtaining a true positive when rule r is employed is the sum of the probabilities (2) for all tables T such that r(T) = 1:

P_{P∧Q}(TP) = Σ_{T : r(T) = 1} P_{P∧Q}(T).

Therefore, the probability of incurring a false negative is

P_{P∧Q}(FN) = 1 − P_{P∧Q}(TP).

We cannot proceed in a completely analogous way to define true negatives and false positives, because ¬(P ∧ Q) is not a state of nature, but an ensemble of three states, each of which may yield different probabilities. At this point, there are two possible modelling paths, according to the information available: either there is no further information about the true state of nature (or we do not want to use it); or there is enough information to postulate an "a priori" probability π on the states of nature, and we can follow a Bayesian approach.
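As a worked instance, the true-positive and false-negative probabilities of the premiss-based rule can be computed by summing multinomial probabilities over its positive tables. This is a sketch with illustrative names; the final check uses the fact that, under the independence hypotheses, the premiss-based rule is right exactly when both separate majorities are right.

```python
from math import comb, factorial


def tables(n):
    """Yield every voting table (x, y, z, t) adding up to n."""
    for x in range(n + 1):
        for y in range(n - x + 1):
            for z in range(n - x - y + 1):
                yield (x, y, z, n - x - y - z)


def probability_under_pq(table, theta):
    """Probability of a table when P ∧ Q is the true state of nature."""
    x, y, z, t = table
    n = x + y + z + t
    coefficient = factorial(n) // (factorial(x) * factorial(y) * factorial(z) * factorial(t))
    return coefficient * theta ** (2 * x + y + z) * (1 - theta) ** (y + z + 2 * t)


def r_pb(x, y, z, t):  # premiss-based rule
    return int(x + y > z + t and x + z > y + t)


def true_positive(rule, n, theta):
    """Sum of the table probabilities over the positive tables of the rule."""
    return sum(probability_under_pq(T, theta) for T in tables(n) if rule(*T))


def false_negative(rule, n, theta):
    return 1 - true_positive(rule, n, theta)


n, theta = 3, 0.75
tp = true_positive(r_pb, n, theta)
# Probability that a simple majority of n voters judges one premiss correctly.
majority = sum(comb(n, k) * theta**k * (1 - theta) ** (n - k) for k in range(n // 2 + 1, n + 1))
print(tp, majority**2, false_negative(r_pb, n, theta))
```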
The main line in this paper is the first path, always applicable. Let us deviate for a moment and sketch the second one, which corresponds to a situation considered, among others, in Terzopoulou and Endriss [43], Bovens and Rabinowicz [4] and Bozbay [5]. In the Bayesian approach, P_{P∧Q} is interpreted as a conditional probability given P ∧ Q, and analogously for the other ones, that we denote P_{P∧¬Q}, P_{¬P∧Q}, and P_{¬P∧¬Q}. Hence, the probability of a true negative in this setting will be

P_{¬(P∧Q)}(TN) = P_{P∧¬Q}(TN) · π(P ∧ ¬Q) + P_{¬P∧Q}(TN) · π(¬P ∧ Q) + P_{¬P∧¬Q}(TN) · π(¬P ∧ ¬Q)

(with the weights π adding up to 1 over the three negative states, as the example that follows presupposes), and then the probability of a false positive is given by

P_{¬(P∧Q)}(FP) = 1 − P_{¬(P∧Q)}(TN).        (6)

For example, if π is assumed to give the same probability to all three negative states, then P_{¬(P∧Q)}(TN) will be the arithmetic mean of the probabilities of TN under each state. This is the chosen prior in [43]; that of [4] is different, and Bozbay [5] completely forbids the result ¬P ∧ ¬Q. In general, if the committee knows the prior, the independence in the judgments of the premisses (H4) cannot be assumed. Now, using expressions (3), (4) and (5), the probabilities of a true negative under the three negative states are

P_s(TN) = Σ_{T : r(T) = 0} P_s(T),  for s ∈ {P ∧ ¬Q, ¬P ∧ Q, ¬P ∧ ¬Q},        (7)

and the probability of a false positive is then computed from (6) and (7).
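The Bayesian computation can be sketched as a weighted average of the true-negative probabilities under the three negative states. The code below assumes a prior normalised over those three states and encodes states as pairs of booleans; both conventions, and all function names, are illustrative assumptions rather than the paper's implementation.

```python
from math import factorial

NEGATIVE_STATES = [(True, False), (False, True), (False, False)]  # (P true?, Q true?)


def tables(n):
    """Yield every voting table (x, y, z, t) adding up to n."""
    for x in range(n + 1):
        for y in range(n - x + 1):
            for z in range(n - x - y + 1):
                yield (x, y, z, n - x - y - z)


def table_probability(table, theta, state):
    """Multinomial probability of a table under the given state of nature."""
    x, y, z, t = table
    n = x + y + z + t
    says_p = theta if state[0] else 1 - theta
    says_q = theta if state[1] else 1 - theta
    slots = (says_p * says_q, says_p * (1 - says_q),
             (1 - says_p) * says_q, (1 - says_p) * (1 - says_q))
    coefficient = factorial(n) // (factorial(x) * factorial(y) * factorial(z) * factorial(t))
    return coefficient * slots[0]**x * slots[1]**y * slots[2]**z * slots[3]**t


def r_pb(x, y, z, t):  # premiss-based rule
    return int(x + y > z + t and x + z > y + t)


def true_negative(rule, n, theta, state):
    """Probability that the rule decides ¬C under a negative state of nature."""
    return sum(table_probability(T, theta, state) for T in tables(n) if not rule(*T))


def bayes_false_positive(rule, n, theta, prior):
    """One minus the prior-weighted average of the true-negative probabilities."""
    tn = sum(prior[s] * true_negative(rule, n, theta, s) for s in NEGATIVE_STATES)
    return 1 - tn


uniform = {s: 1 / 3 for s in NEGATIVE_STATES}
print(bayes_false_positive(r_pb, 3, 0.75, uniform))
```

With the uniform prior the result is one minus the arithmetic mean of the three true-negative probabilities, as described in the text.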
After this digression, let us turn to our main setting. For the non-Bayesian situation, we can resort to the following analogy with the classical theory of Statistical Hypothesis Testing: Suppose one has to decide whether a sample drawn from a population provides enough evidence that a certain population parameter is equal to a value C. To this end, one computes how likely it is that the observed sample was produced by the value of the parameter in the complement set ¬C which is "the closest" to C. If that likelihood is acceptable (by some numerical threshold), the decision is to stick to the "null" (status quo) conclusion ¬C. If it is not acceptable, C is proclaimed as the new estimated conclusion.
Translating the analogy to our case, we must ask ourselves which of the states of nature belonging to the complement set ¬(P ∧ Q) is closest to P ∧ Q. Intuitively, P ∧ ¬Q and ¬P ∧ Q are equally close, and are closer than ¬P ∧ ¬Q. This is rigorously stated in the next proposition.
Although the statement is intuitive, its rigorous proof is somewhat technical. We use a probabilistic procedure called coupling, which transforms inequalities about probabilities into inequalities about random variables.
These arguments support the definition of the "probability" of a false positive as the probability that a rule decides C when P ∧ ¬Q is the true state of nature:
\[ P_{P \wedge \neg Q}(\mathrm{FP}) := 1 - P_{P \wedge \neg Q}(\mathrm{TN}) . \]
It can also be thought of as the maximum of the probabilities of a false positive over all possible choices of a prior π on the states of nature. It is therefore a conservative estimate of the possible error, in response to the lack of information about the underlying truth.
Proposition 3.2. Under hypotheses (H1)-(H4), for any admissible decision rule r,
\[ P_{\neg P \wedge Q}\{r = 1\} = P_{P \wedge \neg Q}\{r = 1\} \ge P_{\neg P \wedge \neg Q}\{r = 1\} . \]
Proof. The first equality is clear from Definition 2.1, item 1. We only need to prove the inequality on the right. Let M_{¬P∧¬Q} and M_{P∧¬Q} be two probability measures defined on the subsets of a set Ω, and let T be a random variable T : Ω → T such that the law of T under M_{¬P∧¬Q} is P_{¬P∧¬Q} and the law of T under M_{P∧¬Q} is P_{P∧¬Q}; that is, the laws given by (5) and (3), respectively. Suppose we could define another random variable S : Ω → T such that
a) T(ω) ≤ S(ω), for all ω ∈ Ω, and
b) the law of S under M_{¬P∧¬Q} coincides with the law of T under M_{P∧¬Q}.
Then, since r is order-preserving, we will have {ω : r(S(ω)) = 1} ⊇ {ω : r(T(ω)) = 1}, and the conclusion follows:
\[ P_{\neg P \wedge \neg Q}\{r = 1\} = M_{\neg P \wedge \neg Q}\{r(T) = 1\} \le M_{\neg P \wedge \neg Q}\{r(S) = 1\} = M_{P \wedge \neg Q}\{r(T) = 1\} = P_{P \wedge \neg Q}\{r = 1\} . \]
Let us prove the existence of S : Ω → T with the properties a) and b) above, and we are done. Let T_1, ..., T_n be independent identically distributed random variables T_i : Ω → T with the same law as T but for the vote of one individual. We will switch to table notation again, for clarity, in the rest of the proof.
Let S_1, ..., S_n be another collection of independent identically distributed random variables S_i : Ω → T, defined as follows: We have clearly that T_i(ω) ≤ S_i(ω), for all ω ∈ Ω.
Let us compute the law of S_i under M_{¬P∧¬Q}, conditioning on the value of T_i: We see that the law of each S_i under M_{¬P∧¬Q} coincides with that of T_i under M_{P∧¬Q}. Now, setting T := T_1 + ⋯ + T_n and S := S_1 + ⋯ + S_n, we have that T ≤ S, that T has the law given by (5) under M_{¬P∧¬Q} and by (3) under M_{P∧¬Q}, and that S has the law given by (3) under M_{¬P∧¬Q}.
Hence, T and S are the random variables we were looking for, and the proof is complete.
As a corollary, since the maximum is always at least as large as any weighted mean in (6), we get that P_{P∧¬Q}{r = 1} is greater than or equal to the probability of a false positive computed with any prior distribution on the set ¬(P ∧ Q).
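The ordering in Proposition 3.2 and the corollary can be checked numerically on a small case. The following is only a sketch: it assumes that each voter judges each premiss correctly with probability θ, independently across voters and premisses (our reading of (H1)-(H4)), and it takes the premiss-based rule with n = 3 as one example of an admissible rule. All function and variable names are illustrative.

```python
from math import factorial

# Vote categories in table order (x, y, z, t): P&Q, P&notQ, notP&Q, notP&notQ.
VOTES = [(True, True), (True, False), (False, True), (False, False)]

def vote_probs(state, theta):
    """Law of a single vote when state = (P is true, Q is true).

    Assumption: each premiss is judged correctly with probability theta,
    independently of the other premiss.
    """
    p_true, q_true = state
    return [(theta if vp == p_true else 1 - theta) *
            (theta if vq == q_true else 1 - theta) for vp, vq in VOTES]

def tables(n):
    """All voting tables (x, y, z, t) with x + y + z + t = n."""
    for x in range(n + 1):
        for y in range(n + 1 - x):
            for z in range(n + 1 - x - y):
                yield (x, y, z, n - x - y - z)

def table_prob(table, state, theta):
    """Multinomial probability of a voting table under a state of nature."""
    coef = factorial(sum(table))
    for count in table:
        coef //= factorial(count)
    prob = float(coef)
    for count, q in zip(table, vote_probs(state, theta)):
        prob *= q ** count
    return prob

def prob_decide_C(rule, n, state, theta):
    """Probability that the rule concludes C under the given state."""
    return sum(table_prob(T, state, theta) for T in tables(n) if rule(T))

def premiss_based(table):
    """C is concluded when both P and Q obtain a strict majority."""
    x, y, z, t = table
    n = x + y + z + t
    return 2 * (x + y) > n and 2 * (x + z) > n

n, theta = 3, 0.7
negative_states = {
    "P_notQ": (True, False),
    "notP_Q": (False, True),
    "notP_notQ": (False, False),
}
fp = {name: prob_decide_C(premiss_based, n, state, theta)
      for name, state in negative_states.items()}
# False-positive probability under a uniform prior on the negative states.
uniform_prior_fp = sum(fp.values()) / 3
print(fp, uniform_prior_fp)
```

In this example the two states with exactly one true premiss give the same probability of concluding C, and that common value bounds the uniform-prior average from above, as the corollary predicts.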
In a completely analogous way, one can also prove that, for admissible rules, the probability of concluding C is greater when the true state of nature is P ∧ Q than under any other state.
We no longer need the subindices in the probabilities of false positives and false negatives, since P(FN) always refers to the state P ∧ Q, and P(FP) refers to the state P ∧ ¬Q (or to the given prior on ¬(P ∧ Q) in the Bayesian case). Instead, we will subindex P by the rule employed. For reference in the sequel, we repeat here the formulae for FP and FN: For any rule r : T → {0, 1},
\[ P_r(\mathrm{FP}) = \sum_{\{r(x,y,z,t) = 1\}} \frac{n!}{x!\,y!\,z!\,t!}\, \theta^{x+2y+t} (1-\theta)^{x+2z+t} , \tag{11} \]
\[ P_r(\mathrm{FN}) = 1 - \sum_{\{r(x,y,z,t) = 1\}} \frac{n!}{x!\,y!\,z!\,t!}\, \theta^{2x+y+z} (1-\theta)^{y+z+2t} . \tag{12} \]

Optimisation criterion
We want to obtain the best decision rule, under the probabilistic model stated above and the optimisation criterion that we develop in this subsection. This criterion was introduced in [1] and is based on minimising a weighted sum of the probability of committing a false positive and the probability of committing a false negative. It can be thought of as a multi-objective optimisation problem, but that point of view does not contribute any practical insight.
Any rule r : T → {0, 1} (admissible or not) has associated probabilities of producing a False Positive P_r(FP) and a False Negative P_r(FN) according to formulae (11)-(12). Despite the simplified notation, recall that these two probabilities stem from different states of nature. If both failures are considered equally harmful, it is natural to look for the admissible rule r ∈ A that minimises the sum
\[ P_r(\mathrm{FP}) + P_r(\mathrm{FN}) . \tag{13} \]
If one of them is considered worse than the other, one can take a weighted sum
\[ L_w(r) := w\,P_r(\mathrm{FP}) + (1 - w)\,P_r(\mathrm{FN}) , \tag{14} \]
where w is a real number, 0 < w < 1, as the loss function to minimise. For example, if a false positive is deemed twice as harmful as a false negative, w = 2/3 is the suitable value. Note that the weight w is a modelling choice relative to each particular application, and it is supposed to be fixed in advance of the voting stage.
In Statistics, the combination (13) of probabilities of the two types of error is called the area of the triangle, a term that comes from its origin in signal processing and the graphical methodology called Receiver Operating Characteristic (ROC). We refer the reader to Fawcett [15] for a simple introduction to ROC.
If r and s are two admissible rules, and r ≤ s (equivalently, U_r ⊆ U_s, where U_r and U_s are the upper sets defining r and s respectively), then obviously, from (11) and (12),
\[ P_r(\mathrm{FP}) \le P_s(\mathrm{FP}) , \qquad P_r(\mathrm{FN}) \ge P_s(\mathrm{FN}) . \]
This means that r ↦ P_r(FP) and r ↦ P_r(FN) are respectively an increasing function and a decreasing function defined on the poset of admissible rules (A, ≤). Moreover, the rule r ≡ 0 (always conclude ¬C) satisfies P_r(FP) = 0 and P_r(FN) = 1, and the rule r ≡ 1 (always conclude C) satisfies P_r(FP) = 1 and P_r(FN) = 0.
As we said before, the value θ = 1/2 can be excluded because it is trivial: if w < 1/2, the decision must always be C; if w > 1/2, the decision must always be ¬C; and if w = 1/2, then the problem is completely equivalent to a single coin toss. See the appendix for the details.
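As an illustration of the criterion, the loss L_w can be evaluated by direct enumeration of the voting tables. The sketch below assumes that formulae (11)-(12) are the multinomial sums with terms θ^{x+2y+t}(1−θ)^{x+2z+t} (state P ∧ ¬Q) and θ^{2x+y+z}(1−θ)^{y+z+2t} (state P ∧ Q); the function names, and the encoding of a rule as a Python predicate on tables, are hypothetical choices made for this example.

```python
from math import factorial

def tables(n):
    """All voting tables (x, y, z, t) with x + y + z + t = n."""
    for x in range(n + 1):
        for y in range(n + 1 - x):
            for z in range(n + 1 - x - y):
                yield (x, y, z, n - x - y - z)

def multinomial(table):
    """Number of ways the n voters can produce the given table."""
    coef = factorial(sum(table))
    for count in table:
        coef //= factorial(count)
    return coef

def error_probs(rule, n, theta):
    """Return (P_r(FP), P_r(FN)) under the assumed form of (11)-(12)."""
    fp = tp = 0.0
    for x, y, z, t in tables(n):
        if rule((x, y, z, t)):
            m = multinomial((x, y, z, t))
            # False positive: C is concluded while the true state is P and not Q.
            fp += m * theta ** (x + 2 * y + t) * (1 - theta) ** (x + 2 * z + t)
            # True positive: C is concluded while the true state is P and Q.
            tp += m * theta ** (2 * x + y + z) * (1 - theta) ** (y + z + 2 * t)
    return fp, 1.0 - tp

def loss(rule, n, theta, w):
    """Weighted loss L_w(r) = w P_r(FP) + (1 - w) P_r(FN)."""
    fp, fn = error_probs(rule, n, theta)
    return w * fp + (1 - w) * fn

def always_C(table):
    return True

def never_C(table):
    return False

def premiss_based(table):
    x, y, z, t = table
    n = x + y + z + t
    return 2 * (x + y) > n and 2 * (x + z) > n

for rule in (never_C, always_C, premiss_based):
    print(rule.__name__, loss(rule, n=3, theta=0.7, w=0.5))
```

The two constant rules reproduce the boundary values stated above: the loss is 1 − w for r ≡ 0 and w for r ≡ 1.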

Main results
Suppose we have an admissible rule r, with sets T+ of positive tables and T0 of null tables (recall the definitions in Section 2). If we choose a table T ∈ T0 and move it to T+, we are increasing the probability of a false positive and at the same time decreasing the probability of a false negative. In doing that, we also have to move its transposed table, to maintain the first condition of admissibility. This movement may result in a decrease or an increase of the loss function L_w.
Definition 4.1. Let r : T → {0, 1} be an admissible decision rule, with positive set T+ and null set T0. If moving a table T and its transposed table (and no other) from T0 to T+ results in a decrease of the loss function, we will say that T is a good table.
Here "good" only means that the voting table T "should be supporting decision C". Thus, it seems that the set T+ of the optimal rule must consist of the good tables and no others. However, we still have to see that the rule defined in this way is indeed admissible.
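Theorem 4.2. The rule whose positive set T+ consists exactly of the good tables is admissible and optimal.
This is a consequence of the following two lemmas, which are interesting in their own right. Lemma 4.3 characterises the good tables in terms of w and θ, and confirms that a table and its transposed are both good or both bad. Lemma 4.4 proves that the second condition of admissibility is also satisfied, in view of Proposition 2.5.
Lemma 4.3. A voting table (x, y, z, t) is good if and only if
\[ \Big(\frac{\theta}{1-\theta}\Big)^{-(x-t)+(y-z)} + \Big(\frac{\theta}{1-\theta}\Big)^{-(x-t)-(y-z)} < \frac{2(1-w)}{w} . \tag{15} \]
Proof. Using formulae (11) and (12), the change in the quantity wP(FP) + (1 − w)P(FN) when T and its transposed table are moved from the null to the positive set is given by
\[ \frac{n!}{x!\,y!\,z!\,t!} \Big[ w \big( \theta^{x+2y+t}(1-\theta)^{x+2z+t} + \theta^{x+2z+t}(1-\theta)^{x+2y+t} \big) - 2(1-w)\, \theta^{2x+y+z}(1-\theta)^{y+z+2t} \Big] . \tag{16} \]
Dropping the factorials and dividing by θ^{2x+y+z}(1−θ)^{y+z+2t}, we see that (16) is negative when
\[ w \Big[ \Big(\frac{\theta}{1-\theta}\Big)^{-(x-t)+(y-z)} + \Big(\frac{\theta}{1-\theta}\Big)^{-(x-t)-(y-z)} \Big] < 2(1-w) , \tag{17} \]
and this is immediately equivalent to (15). In the case y = z, the quantity (16) should be divided by two, but the conclusion is the same.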
Lemma 4.4.The set of good tables is an upper set of (T, ≤).
Proof. It is easy to see that, for each fixed 1/2 < θ < 1, the function (17) is decreasing in (T, ≤). It is enough to check it for the pairs of the transitive reduction (1). Moving from the smaller to the greater table of the pair, one of the terms in (17) remains unchanged whereas the other one is multiplied by (θ/(1−θ))^{−2} < 1. Therefore, if a table is good, then according to Lemma 4.3 any greater table in the poset (T, ≤) is also good, and the good tables indeed form an upper set.
Notice that the optimal rule according to L w among those satisfying the first admissibility condition only, automatically satisfies the second.
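The characterisation of good tables can be cross-checked against the definition by brute force. The sketch below assumes that condition (15) compares the sum of the two powers of θ/(1−θ), with exponents −(x−t) ± (y−z), against 2(1−w)/w, and that the table laws under P ∧ Q and P ∧ ¬Q are the multinomial ones of (11)-(12). Function names are illustrative.

```python
from math import factorial

def tables(n):
    """All voting tables (x, y, z, t) with x + y + z + t = n."""
    for x in range(n + 1):
        for y in range(n + 1 - x):
            for z in range(n + 1 - x - y):
                yield (x, y, z, n - x - y - z)

def multinomial(table):
    coef = factorial(sum(table))
    for count in table:
        coef //= factorial(count)
    return coef

def prob_PQ(table, theta):
    """Probability of the table when the true state is P and Q (assumed law)."""
    x, y, z, t = table
    return multinomial(table) * theta ** (2 * x + y + z) * (1 - theta) ** (y + z + 2 * t)

def prob_PnotQ(table, theta):
    """Probability of the table when the true state is P and not Q (assumed law)."""
    x, y, z, t = table
    return multinomial(table) * theta ** (x + 2 * y + t) * (1 - theta) ** (x + 2 * z + t)

def loss_change(table, theta, w):
    """Change in L_w when the table and its transposed table become positive."""
    x, y, z, t = table
    pair = {table, (x, z, y, t)}  # a set, so a self-transposed table counts once
    return sum(w * prob_PnotQ(T, theta) - (1 - w) * prob_PQ(T, theta) for T in pair)

def is_good(table, theta, w):
    """Closed-form test: assumed reading of inequality (15)."""
    x, y, z, t = table
    ratio = theta / (1 - theta)
    lhs = ratio ** (-(x - t) + (y - z)) + ratio ** (-(x - t) - (y - z))
    return lhs < 2 * (1 - w) / w

n, theta, w = 5, 0.65, 0.4
agree = all(is_good(T, theta, w) == (loss_change(T, theta, w) < 0) for T in tables(n))
print("closed form agrees with the definition:", agree)
```

The closed-form test needs only the two differences x − t and y − z, which is what makes the later reduction to (ρ, α) classes possible.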
The next theorem determines when, under the optimal decision rule, a voting table leads to a verdict of C = P ∧ Q or the opposite, for the symmetric case w = 1/2.This is a further characterisation of the condition (15).We prove this special case because the statement and the proof are neater, and the extension to general 0 < w < 1 is straightforward, as will be seen after the theorem.
In words, the theorem says that: if votes in favour of P ∧ Q are less than those in favour of ¬P ∧ ¬Q or there is a tie, then the decision must be ¬C; otherwise, if the difference is greater than the difference in absolute value between votes for P ∧ ¬Q and votes for ¬P ∧ Q, then the decision must be C; otherwise, the decision must be C if the competence θ of the committee is below a certain threshold (which can be computed to any desired accuracy), and ¬C if it is above.
And these are all possible cases.
Proof. Recall that ρ = x − t and α = |y − z|. Starting with the last claim, these are all possible cases because n = x + y + z + t odd implies that α and ρ have different parity. In particular, ρ ≠ α and ρ ≠ −α.
Consider the bijective increasing transformation η = θ/(1−θ), which maps (1/2, 1) onto (1, ∞). The left-hand side of (15) can then be written as a function
\[ G(\eta) := \eta^{-\rho+\alpha} + \eta^{-\rho-\alpha} , \tag{18} \]
with ρ and α integers, α ≥ 0. The restriction x + y + z + t = n translates into |ρ| + α ≤ n, with ρ and α of different parity, as already noted. According to Lemma 4.3, the table T is good for values of η such that G(η) < 2, and the optimal rule r should assign it the value 1; for values of η such that G(η) > 2, the table is bad and we must have r(T) = 0.
Clearly, G is differentiable on (1, ∞) and lim_{η↓1} G(η) = 2, for all ρ and α. Also, the derivative of G can always be written as
\[ G'(\eta) = (\alpha-\rho)\,\eta^{-\rho+\alpha-1} - (\alpha+\rho)\,\eta^{-\rho-\alpha-1} , \]
and we have lim_{η↓1} G′(η) = −2ρ. Thus, G takes the value 2 at the left boundary of the domain of interest, and starts from there decreasing or increasing according to the sign of ρ.
Let us now proceed with the three cases of the statement. Please refer to Figure 2.
In other words, for a competence value θ less than θ_0 := η_0/(1+η_0), the table is good. For θ > θ_0, the table is bad. This finishes the proof.
The case for general 0 < w < 1 is very easy to explain with the help of Figure 2. For w < 1/2, the dashed horizontal line is above level 2. All tables are good for small enough competence levels θ (the decision must always be C). The tables of type a and c are bad for θ greater than some θ_0. The tables of type b are good for all competence levels. For w > 1/2, the dashed line is below level 2, and all tables are bad for low enough competence levels (the decision must always be ¬C). Tables of type a stay bad for all θ, and tables of type b turn good after some point. For tables of type c, two things may happen: either they are always bad or, as θ increases, they have an interval of "goodness" before turning bad again.
All intersection points are easily computed to any desired precision by solving numerically for η the equation G(η) = 2(1−w)/w on (1, ∞). See Section 5 for more details.
Lemma 4.3 allows a notable conceptual, notational and computational simplification: since the function G in (18) only depends on ρ = x − t and α = |y − z|, the tables in T with the same ρ and α will all be good or all bad, once θ and w are fixed. If, on the contrary, two given tables do not share these values, they produce two different functions G.
This allows us to consider an equivalence relation in (T, ≤) that gives rise to a quotient ranked poset, reducing in this way the complexity of the Hasse diagram and the computations. Define
\[ (x, y, z, t) \sim (x', y', z', t') \iff x - t = x' - t' \ \text{and}\ |y - z| = |y' - z'| . \]
In particular, this equivalence relation identifies transposed tables. The elements of the quotient set T/∼ are classes of voting tables and can be represented by the pair (ρ, α). We can write T ∈ (ρ, α) if T is in the class represented by (ρ, α). Now define the preorder relation in T/∼ given by
(ρ, α) ≤ (ρ′, α′) ⇐⇒ there exist T ∈ (ρ, α) and T′ ∈ (ρ′, α′) such that T ≤ T′.
We use the same symbol '≤' for both relations in T and T/∼, since there is no possible confusion. It can be proved in general that a relation defined in this way in the quotient set is reflexive and transitive, and therefore a preorder. In general it is not antisymmetric.
We shall prove that in our case antisymmetry holds, so that we have again a partial order. To this end, we make use of the following lemma (see Hallam [18]). A proof is included in the appendix, for the reader's convenience.
Lemma 4.6. Let (X, ≤) be a finite poset, ∼ an equivalence relation on X, and ⪯ the preorder on X/∼ defined as: x̄ ⪯ ȳ ⇔ ∃x ∈ x̄, ∃y ∈ ȳ : x ≤ y. Assume that if x̄ ⪯ ȳ in X/∼, then for all x ∈ x̄, there exists y ∈ ȳ such that x ≤ y in X. Then, (X/∼, ⪯) is a poset.
It is not difficult to show that the hypothesis of the lemma holds in our case; see the appendix.
Since we are identifying tables in the same rank level, the resulting quotient poset is also ranked, with the same rank function ρ.
The relations (1) defining the transitive reduction in (T, ≤) translate to
\[ (\rho, \alpha) \le (\rho+1, \alpha-1) \quad (\alpha \ge 1) , \qquad (\rho, \alpha) \le (\rho+1, \alpha+1) , \tag{21} \]
in the quotient poset (whenever the right-hand sides are classes of T/∼), and it is easy to see, while proving that Lemma 4.6 is applicable to (T, ≤) (see the appendix), that (21) is precisely the transitive reduction in the quotient poset. This makes Hasse diagrams like those of Figure 3 very easy to generate for any n.
The classical premiss-based rule coincides with the one formed exactly by the tables of type b (see Example 2.2). This suggests that it is possible to characterise exactly under what conditions on w and θ the premiss-based rule is the optimal one. The result is given in Theorem 4.9. It will be a consequence of the following lemma. In the sequel we will denote by G_{ρ,α} the function defined in (18).
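The coincidence between the premiss-based rule and the tables of type b is easy to verify by enumeration. In this sketch the three types are taken to be a (ρ ≤ 0), b (ρ > α) and c (0 < ρ < α), with ρ = x − t and α = |y − z|, and the premiss-based rule is taken to conclude C when both P and Q obtain a strict majority; both readings are assumptions based on the surrounding text, and the function names are illustrative.

```python
def tables(n):
    """All voting tables (x, y, z, t) with x + y + z + t = n."""
    for x in range(n + 1):
        for y in range(n + 1 - x):
            for z in range(n + 1 - x - y):
                yield (x, y, z, n - x - y - z)

def table_type(table):
    """Type of a table for odd n: 'a', 'b' or 'c'."""
    x, y, z, t = table
    rho, alpha = x - t, abs(y - z)
    if rho <= 0:
        return "a"
    return "b" if rho > alpha else "c"

def premiss_based(table):
    """C is concluded when both P and Q obtain a strict majority."""
    x, y, z, t = table
    n = x + y + z + t
    return 2 * (x + y) > n and 2 * (x + z) > n

def check(n):
    """True if, for this odd n, rho != alpha always and type b == premiss-based."""
    for T in tables(n):
        x, y, z, t = T
        if x - t == abs(y - z):          # excluded by the parity argument
            return False
        if premiss_based(T) != (table_type(T) == "b"):
            return False
    return True

print({n: check(n) for n in (3, 5, 7)})
```

The equivalence behind the check is elementary: x + y > z + t and x + z > y + t hold together exactly when x − t > |y − z|.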
Figure 3: Hasse diagrams for n = 3, 5, 7, in the quotient poset of (ρ, α)-tables. In boldface, tables of type b; in normal type, those of type c; greyed out, those of type a.
2. The union of tables of type a and c has a unique maximal element: Table ((n−1)/2, (n+1)/2).
Proof. The statements are equivalent to saying that G_{ρ,α} ≤ G_{1,0} for ρ > α ≥ 0, and that G_{ρ,α} ≥ G_{(n−1)/2,(n+1)/2} for ρ < α with α ≥ 0. For the first inequality, both exponents −ρ − α and −ρ + α are less than or equal to −1, the exponent in G_{1,0}(η) = 2η^{−1}; for the second, the first exponent is positive, hence at least 1, and the second is greater than or equal to −n, whereas G_{(n−1)/2,(n+1)/2}(η) = η + η^{−n}.
Theorem 4.9. Let 0 < w < 1 and 1/2 < θ < 1 be the given weight and competence level, and n the committee size. The premiss-based rule is optimal if and only if
\[ w \le \theta \qquad \text{and} \qquad \frac{2(1-w)}{w} \le \frac{\theta}{1-\theta} + \Big(\frac{1-\theta}{\theta}\Big)^{n} . \tag{22} \]
Proof. Denote, as before, η := θ/(1−θ), and set also ξ := 2(1−w)/w. In view of Lemma 4.8, the necessary and sufficient condition for the premiss-based rule to be optimal is that the point (η, ξ) lie above the curve G_{1,0} and below the curve G_{(n−1)/2,(n+1)/2}. The first inequality is equivalent to θ ≥ w, and we are done.
A simpler sufficient condition, independent of the committee size, is given in the next corollary.Figure 4 illustrates both theorem and corollary.
Proof. The second inequality results from ignoring the negative exponential term in (22).
Example 4.11. From the corollary, in the balanced case w = 1/2, one can be sure that the premiss-based rule is optimal as soon as the competence level is above 2/3. If this is not the case, other voting tables (those of type c) have to be successively added to the set of positive tables, as the competence level decreases. Table 2 shows the critical θ_0, to four decimal places, for the tables of type c with n ≤ 7. Notice that the value of θ_0 is independent of n.
One might conjecture that the tables of type c enter the optimal rule following the increasing value of −ρ + α, and among those with the same value, following the decreasing value of ρ + α. This is true up to n = 11. For n = 13 this regularity breaks down and table (3, 10) enters before (1, 6), at θ_0 = 0.5160. Hence, we do not find here any computational shortcut to determine the optimal rule for general n, even in the case w = 1/2.
By contrast, the classical conclusion-based rule is never optimal: for any n, a pair of tables (x, y, z, t) and (x′, y′, z′, t′) can be found that lead to different results according to the conclusion-based rule, and yet they belong to the same (ρ, α) class. For instance, for n = 3, we have (2, 0, 0, 1) leading to C and (1, 1, 1, 0) leading to ¬C; but both belong to the class (1, 0) and should have the same status in the optimal rule. However, it is worth noting that the conclusion-based rule can be better than the premiss-based rule for certain values of w and θ.
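For n = 3 the last remark can be made explicit. The worked computation below is a sketch that assumes the multinomial table laws behind (11)-(12) and the usual descriptions of the two rules: the premiss-based rule concludes C when both premisses obtain a majority, and the conclusion-based rule when a majority of members vote P ∧ Q. Under these assumptions the two positive sets differ only in the self-transposed table (1, 1, 1, 0), which is positive for the premiss-based rule (pb) and null for the conclusion-based rule (cb).

```latex
\begin{align*}
L_w(\mathrm{pb}) - L_w(\mathrm{cb})
  &= w\,P_{P\wedge\neg Q}\{(1,1,1,0)\} - (1-w)\,P_{P\wedge Q}\{(1,1,1,0)\} \\
  &= 6\,w\,\theta^{3}(1-\theta)^{3} - 6\,(1-w)\,\theta^{4}(1-\theta)^{2} \\
  &= 6\,\theta^{3}(1-\theta)^{2}\,\bigl(w(1-\theta) - (1-w)\theta\bigr) \\
  &= 6\,\theta^{3}(1-\theta)^{2}\,(w-\theta) .
\end{align*}
```

Hence, for n = 3, the conclusion-based rule has the smaller loss exactly when θ < w, in agreement with the role played by the inequality θ ≥ w in the proof of Theorem 4.9.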

Computations and software
The optimality condition adds some more relations to the partial order defined by the admissibility requirement.This simplifies further the computation of the optimal rule.
Proposition 5.1. If (ρ, α) is a positive table in the optimal rule, then (ρ, α − 2) is also a positive table in the optimal rule, whenever these values make sense.
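One way to see why the statement is plausible, assuming that good tables are characterised by G_{ρ,α}(η) < 2(1−w)/w with G_{ρ,α}(η) = η^{−ρ+α} + η^{−ρ−α} and η = θ/(1−θ) > 1, is the following short computation. It is offered as a sketch and is not necessarily the argument used by the authors.

```latex
\begin{align*}
G_{\rho,\alpha}(\eta)
  &= \eta^{-\rho}\bigl(\eta^{\alpha} + \eta^{-\alpha}\bigr)
   = 2\,\eta^{-\rho}\cosh(\alpha \log \eta) , \\
G_{\rho,\alpha-2}(\eta)
  &\le G_{\rho,\alpha}(\eta) < \frac{2(1-w)}{w}
  \qquad (\eta > 1,\ \alpha \ge 2) .
\end{align*}
```

The first inequality holds because cosh is increasing on [0, ∞) and 0 ≤ (α − 2) log η ≤ α log η; the second is the assumption that (ρ, α) is good.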
The first relation in the transitive reduction (21) is no longer present in the new transitive reduction, since now there is an element in between: (ρ, α) ≤ (ρ + 1, α + 1) ≤ (ρ + 1, α − 1). The resulting Hasse diagram is "thinner", and the total number of upper sets is reduced. As an example, the case n = 3 is depicted in Figure 5. There are only twelve upper sets left after this simplification. The new poset is still ranked, but ρ = x − t is no longer a rank function.
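The count of twelve upper sets for n = 3 can be reproduced by brute force. The sketch below rests on two assumptions that are not spelled out in this section: that the classes of the quotient poset are the pairs (ρ, α) with α ≥ 0, |ρ| + α ≤ n and ρ + α of the same parity as n, and that its covering relations are (ρ, α) ≤ (ρ + 1, α + 1) and (ρ, α) ≤ (ρ + 1, |α − 1|). The relation (ρ, α) ≤ (ρ, α − 2) from Proposition 5.1 is then added on top. All names are illustrative.

```python
from itertools import combinations

def classes(n):
    """Classes (rho, alpha) of the quotient poset (assumed description)."""
    return [(rho, alpha)
            for alpha in range(n + 1)
            for rho in range(-n, n + 1)
            if abs(rho) + alpha <= n and (rho + alpha - n) % 2 == 0]

def relations(n, with_optimality):
    """Generating pairs (lower, upper) of the order on the classes."""
    elems = set(classes(n))
    rel = set()
    for rho, alpha in elems:
        # Assumed covering relations of the quotient poset.
        for upper in ((rho + 1, alpha + 1), (rho + 1, abs(alpha - 1))):
            if upper in elems:
                rel.add(((rho, alpha), upper))
        # Extra relation contributed by Proposition 5.1.
        if with_optimality and (rho, alpha - 2) in elems:
            rel.add(((rho, alpha), (rho, alpha - 2)))
    return elems, rel

def count_upper_sets(elems, rel):
    """Count subsets closed under 'lower in the set implies upper in the set'."""
    ordered = sorted(elems)
    count = 0
    for size in range(len(ordered) + 1):
        for subset in combinations(ordered, size):
            chosen = set(subset)
            if all(upper in chosen for lower, upper in rel if lower in chosen):
                count += 1
    return count

reduced = count_upper_sets(*relations(3, with_optimality=True))
original = count_upper_sets(*relations(3, with_optimality=False))
print("upper sets with Proposition 5.1:", reduced)
print("upper sets without it:", original)
```

The exhaustive loop over all subsets is only practical for very small n; it is used here because it makes the definition of upper set explicit.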
This reduction is relevant if we are only interested in the optimal rule.If we want to build instead a ranking of rules, then Proposition 5.1 is not useful.
We have built a program in Python, with a graphical interface, that allows the user to input the values of n, w and θ, and produces a ranking of decision rules. It can currently be found as a public Mercurial repository at https://discursive-dilemma.sourceforge.io/, or requested directly from the authors. The program allows the user to specify different competence levels for the different committee members (an extension discussed in the next section), so that in fact formulae (11) and (12) are not used; instead, the probability of obtaining a voting table (x, y, z, t) is computed taking into account all possible permutations of voters.
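To illustrate how the law of a voting table can be obtained with different competence levels, the sketch below adds the voters one at a time and accumulates the probability of each partial table. This is a convolution-style alternative that we use for brevity; it is not the permutation-based computation of the program described above, and the function names are hypothetical. Each voter k is assumed to judge each premiss correctly with probability θ_k, independently of everything else.

```python
from collections import defaultdict

# Vote categories in table order (x, y, z, t): P&Q, P&notQ, notP&Q, notP&notQ.
VOTES = [(True, True), (True, False), (False, True), (False, False)]

def vote_probs(state, theta):
    """Law of one vote for a voter of competence theta (assumed model)."""
    p_true, q_true = state
    return [(theta if vp == p_true else 1 - theta) *
            (theta if vq == q_true else 1 - theta) for vp, vq in VOTES]

def table_law(state, thetas):
    """Law of J_1 + ... + J_n as a dict {table: probability}."""
    law = {(0, 0, 0, 0): 1.0}
    for theta in thetas:
        updated = defaultdict(float)
        for table, prob in law.items():
            for k, q in enumerate(vote_probs(state, theta)):
                bumped = list(table)
                bumped[k] += 1          # this voter adds one vote to category k
                updated[tuple(bumped)] += prob * q
        law = dict(updated)
    return law

# Competence levels taken from the example of Figure 6.
thetas = (0.6, 0.7, 0.8)
law_PQ = table_law((True, True), thetas)
law_PnotQ = table_law((True, False), thetas)
print(law_PQ[(3, 0, 0, 0)], law_PnotQ[(1, 1, 1, 0)])
```

With these laws, the probabilities of a false positive and a false negative of a rule follow by summing over its positive tables, exactly as in the equal-competence case.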
As an example of output, see Figure 6. Two rankings are produced, the first corresponding to voting tables and rules in extended form (x, y, z, t), and the second in compact form (ρ, α). They are not a direct translation of each other, since a rule that cannot be expressed in compact form (because members of the same (ρ, α) class are assigned different conclusions) may actually be better than the next rule respecting the equivalence relation.
The rules are expressed by means of the antichain that determines the upper set of positive tables.A name is printed if the rule is one of premiss-based, conclusion-based or path-based, and the value of the loss function (14) of each rule is also given.Notice that in this example the rules in positions 3 to 5 in the extended version are not expressible in compact form, but are better than the third rule in the second ranking.Of course, the optimal rule will always coincide in both rankings.
At the moment, the program only outputs up to the five best rules, but this is an arbitrary parameter that can be easily changed in the source code.Also, we have not made any special effort for efficiency.It has been conceived only as a playground and checking tool.
Concerning the computational complexity of producing the admissible decision rules, let us note first that the total number of voting tables is equal to (2n³ + 15n² + 34n + 21)/24 in the original poset (T, ≤), after identifying transposed tables. In the quotient poset (T/∼, ≤), this number is reduced to (n + 1)(n + 2)/2. Since admissible rules are in bijection with upper sets, and these in turn are determined by their antichains, we first identify the latter. For this task we make use of the Python package networkx, which contains a function antichains(). The generation of the corresponding upper set and the evaluation of each table contained in it is very easy. The contribution to the loss function of tables previously computed is stored to speed up the computations. The maximum cardinality of an antichain (after identifying transposed tables) is (n + 3)(n + 1)/8 in (T, ≤) and (n + 1)/2 in (T/∼, ≤). We sketch in the appendix the computation of these numbers.
Apart from having the optimal rule or a ranking of the best rules, one might be just interested in knowing which conclusion has to be assigned to a given voting table T under the optimal rule. In the case of equal competences, this is done simply by checking inequality (15). Otherwise, the probabilities of a False Positive and a False Negative have to be computed for that table, taking into account the different competences, in order to determine its contribution to the loss function.
Finally, in the equal competences case, one may like to determine, given a fixed weight w, the intervals of competence θ where r(T) = 0 or 1 for the optimal rule r. We need to find the root or roots of the equation
\[ \eta^{-\rho+\alpha} + \eta^{-\rho-\alpha} = \frac{2(1-w)}{w} . \tag{23} \]
This is very easy numerically. The functions involved are simple to evaluate, so that pure bisection, for instance, is very fast. We only need to have a bracket where the roots are guaranteed to lie. Indeed, such brackets are readily found:
Tables of type a (ρ ≤ 0): If w < 1/2, the unique root of (23) is less than the root of η^{−ρ+α} = 2(1−w)/w. Therefore, the solution to (23) will be found between 1 and
\[ \Big(\frac{2(1-w)}{w}\Big)^{1/(\alpha-\rho)} . \tag{24} \]
Tables of type b (ρ > α): If w > 1/2, the unique root of (23) is less than the root of 2η^{−ρ+α} = 2(1−w)/w, and it will be found between 1 and
\[ \Big(\frac{w}{1-w}\Big)^{1/(\rho-\alpha)} . \]
Tables of type c (0 < ρ < α): If w ≤ 1/2 there is a unique root, and the same bound (24) is valid.
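The bisection for tables of type c with w ≤ 1/2 can be sketched as follows, assuming that equation (23) reads η^{−ρ+α} + η^{−ρ−α} = 2(1−w)/w. The upper end of the bracket is the bound (24). The lower end is our own choice, not taken from the text: we start at the minimiser of the left-hand side, so that the boundary solution η = 1 of the case w = 1/2 is excluded. The function names are illustrative.

```python
def G(eta, rho, alpha):
    """Left-hand side of the assumed equation (23)."""
    return eta ** (-rho + alpha) + eta ** (-rho - alpha)

def critical_theta(rho, alpha, w=0.5, tol=1e-12):
    """Critical competence theta_0 of a type c class (0 < rho < alpha), w <= 1/2.

    Below theta_0 the class is good, above it the class is bad.
    """
    if not 0 < rho < alpha:
        raise ValueError("expected a table of type c: 0 < rho < alpha")
    if not 0 < w <= 0.5:
        raise ValueError("this sketch only covers 0 < w <= 1/2")
    target = 2 * (1 - w) / w
    # G decreases from the value 2 at eta = 1 down to its minimum at `lo`
    # and increases afterwards, so G(lo) < 2 <= target.
    lo = ((alpha + rho) / (alpha - rho)) ** (1 / (2 * alpha))
    # Bound (24): G(eta) > eta**(alpha - rho), hence G(hi) > target.
    hi = target ** (1 / (alpha - rho))
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if G(mid, rho, alpha) < target:
            lo = mid
        else:
            hi = mid
    eta0 = (lo + hi) / 2
    return eta0 / (1 + eta0)

for rho, alpha in [(1, 2), (1, 6), (3, 10)]:
    print((rho, alpha), round(critical_theta(rho, alpha), 4))
```

For w = 1/2 the value returned for the class (3, 10) can be compared with the critical level 0.5160 quoted in Example 4.11, and the class (1, 6) yields a smaller critical level, consistent with the order of entry described there.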

Conclusions and discussion
We have studied the discursive dilemma in its simplest classical form, and proposed a method to obtain the best rule (or a ranking of rules) by minimising a loss function that combines false positives and false negatives. Actually, we have introduced a family of loss functions, parametrised by the number 0 < w < 1.
The decision rules considered satisfy very mild and reasonable conditions of symmetry and monotonicity (Definition 2.1). In fact, the second condition is not necessary a priori if one is only interested in the best rule and not in ranking rules. In that case, monotonicity appears a posteriori as a property of the optimal rule.
Generically, the optimal rule will be unique, but specific values of weight w and competence θ may lead to ties in the evaluation of the loss function, in particular in its minimum value. To make the exposition simpler, we have avoided mentioning this possibility throughout the paper.
The loss function is a modelling choice. In any real instance it must be chosen to reflect what the best rule is intended to achieve. The important point is that the optimisation setting is worth considering for problems of judgment aggregation in general.
In Alabert-Farre [1], where this point of view was introduced, some possible extensions and open problems were discussed at length. We summarise them here:
• Different competence for each voter. This is the simplest extension. If J_k is the voting table consisting only of the vote of voter k, with competence level θ_k, the resulting table is J_1 + ⋯ + J_n, whose probability law can also be computed, and the probabilities of false positive and false negative will be
\[ P_r(\mathrm{FP}) = \sum_{\{r(x,y,z,t)=1\}} P_{P \wedge \neg Q}\{J_1 + \cdots + J_n = (x,y,z,t)\} , \qquad P_r(\mathrm{FN}) = 1 - \sum_{\{r(x,y,z,t)=1\}} P_{P \wedge Q}\{J_1 + \cdots + J_n = (x,y,z,t)\} . \]
Our software already computes the ranking of rules in this more general situation.
• Different competence for each premiss or state of nature. The competence of a voter may in fact be a vector θ = (θ_P, θ_{¬P}, θ_Q, θ_{¬Q}) of competences depending on the premiss and/or the true state of nature. In List [24], the probability of appearance of the doctrinal paradox is studied also when the competence is different on P and Q. The computation of P_r(FP) and P_r(FN) is more involved in this case, but still feasible.
• Non-independence between voters. If the committee members do not vote independently, perhaps through a deliberation process with influential individuals, then the full joint law of the vector (J_1, ..., J_n) of individual voting tables is needed to compute the law of the sum J_1 + ⋯ + J_n under the different states of nature. Boland [3] studied this situation for the voting of a single question assuming the presence of a "leader" in the committee. Other works that studied epistemic social choice with correlated voters in the last decade include Peleg and Zamir [37], Dietrich and Spiekermann [11] [12], and Pivato [42].
• Non-independence between premisses. In practical examples, the premisses P and Q can very well be interconnected, in the sense that believing that P is true or false can change the perception of the truth or falsity of Q. This may lead to a different competence in asserting Q depending on the decision on P. Then the joint law of the competences under the four states of nature is needed to complete the computations. The extreme case where one combination of premisses is impossible is treated in Bozbay [5], where in addition abstentions are allowed.
• More than two premisses. There is no difficulty in extending the setting to a conjunctive agenda with any number of premisses P_1, ..., P_s. A voting table will be an element of
\[ T = \Big\{ (x_1, \ldots, x_{2^s}) \in \mathbb{N}^{2^s} : \sum_{i=1}^{2^s} x_i = n \Big\} . \]
Note that disjunctive agendas, in which the conclusion is true if and only if at least one premiss is true, are dual to the conjunctive case, by negation of the doctrine (see List [24], Bovens and Rabinowicz [4], or Miyashita [31]). They can be considered easily within our framework.
An extension with an obvious practical interest is allowing abstentions, or committees with an even number of members. It is clear that enforcing an opinion on all clauses of the agenda may be inconvenient or simply impossible. These so-called incomplete judgments have been considered in Gärdenfors [16], Dietrich-List [10], Terzopoulou and Endriss [43], and Bozbay [5].
It is natural to ask which desirable properties the optimal rule of a given criterion satisfies. We leave this as an open question. In relation to the classical axioms of judgment aggregation and their (im)possibility theorems (see e.g. List [25]), and since here we focus on reaching the right conclusion, for whatever reasons, collective rationality can only be achieved by assigning a value to the premisses after deciding on the conclusion (see Pigozzi et al. [40]); but then the properties of monotonicity (in the classical sense), unanimity and systematicity need not be satisfied on the whole agenda. On the other hand, the anonymity requirement is trivially met in our setting. In any case, the advantage of the optimisation model is the immediate existence of decision rules: each of the rules is evaluated through a real-valued loss function, hence at least one rule with a minimal value must exist. Distance-based methods to reach consensus share this feature.
This proof can be found in Hallam's PhD thesis [18]; we include it here for the reader's convenience, and because his statement does not correspond completely to the proof. Suppose x̄ ⪯ ȳ and ȳ ⪯ x̄. Then x_1 ≤ y_1 for some x_1 ∈ x̄ and y_1 ∈ ȳ. But ȳ ⪯ x̄ implies there must be some x_2 ∈ x̄ such that y_1 ≤ x_2. But then, there must be some y_2 ∈ ȳ such that x_2 ≤ y_2. And so on. At some point we must have an equality of elements, since the set is finite. Then the classes x̄ and ȳ must coincide.
If the elements of (ρ_0, α_0) and (ρ_1, α_1) are not related by the transitive reduction, the argument can be iterated through a chain of elements related by the transitive reduction. Note that we have also deduced en passant the transitive reduction of the quotient poset.
• Computation of the number of voting tables (Section 5): The Whitney numbers W_ρ of a finite ranked poset are defined as the number of elements in rank level ρ. We compute the total number of tables in T by computing first the Whitney numbers. We assume transposed tables are identified.
For odd positive ranks ρ, the possible values of the pair (x, t) are (ρ + r, r), for r = 0, 1, ..., (n − ρ)/2. For each fixed r, the possible values of the pair (y, z), with y ≥ z, are (n − ρ − 2r − s, s), for s = 0, 1, ..., (n − ρ − 2r)/2. Thus, there are (n − ρ − 2r)/2 + 1 such pairs. Adding up these quantities from r = 0 to (n − ρ)/2 yields (n − ρ + 4)(n − ρ + 2)/8. A similar counting gives (n − ρ + 3)(n − ρ + 1)/8 for even non-negative ranks ρ, which is the same number as the odd rank immediately above. The case of negative ranks is deduced by symmetry. Therefore we can write
\[ W_\rho = \begin{cases} \tfrac{1}{8}(n - |\rho| + 4)(n - |\rho| + 2) , & \rho \text{ odd} , \\ \tfrac{1}{8}(n - |\rho| + 3)(n - |\rho| + 1) , & \rho \text{ even} . \end{cases} \]
A simple but tedious computation, adding up all the Whitney numbers for −n ≤ ρ ≤ n, yields the total number of tables
\[ \sum_{\rho=-n}^{n} W_\rho = \tfrac{1}{24}(2n^3 + 15n^2 + 34n + 21) . \]
The number max_ρ W_ρ is the maximal cardinality of an antichain: indeed, since there is a unique minimal and a unique maximal element in (T, ≤), any element of the poset is comparable to the minimal and maximal element of the poset, and therefore to some element of the most populated rank level. This implies that there cannot be more than max_ρ W_ρ elements in any antichain. Ranks −1, 0, and 1 are the most populated, and their Whitney number is (n + 3)(n + 1)/8.
For completeness, let us just mention that the number of tables in the original poset (T, ≤), without identifying transposed tables, is (4n³ + 24n² + 44n + 24)/24, and the most populated rank has (n + 3)(n + 1)/4 elements. In the quotient poset of (ρ, α)-tables, there are (n + 2)(n + 1)/2 tables and a maximum of (n + 1)/2 members in any antichain. All computations are similar to those shown here.
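The counting formulae above can be checked by enumeration for small odd n. The sketch below identifies transposed tables by keeping only those with y ≥ z, and uses the rank ρ = x − t and the class (ρ, α) = (x − t, |y − z|); the helper names are illustrative.

```python
from collections import Counter

def tables(n):
    """All voting tables (x, y, z, t) with x + y + z + t = n."""
    for x in range(n + 1):
        for y in range(n + 1 - x):
            for z in range(n + 1 - x - y):
                yield (x, y, z, n - x - y - z)

def whitney_formula(n, rho):
    """Whitney number of rank rho for odd n, transposed tables identified."""
    m = n - abs(rho)
    if rho % 2:                      # odd rank
        return (m + 4) * (m + 2) // 8
    return (m + 3) * (m + 1) // 8    # even rank

def check_counts(n):
    """Compare the enumeration with the closed formulae for one odd n."""
    all_tables = list(tables(n))
    identified = [T for T in all_tables if T[1] >= T[2]]
    ranks = Counter(x - t for x, y, z, t in identified)
    classes = {(x - t, abs(y - z)) for x, y, z, t in all_tables}
    return (
        all(ranks[rho] == whitney_formula(n, rho) for rho in range(-n, n + 1))
        and 24 * len(identified) == 2 * n**3 + 15 * n**2 + 34 * n + 21
        and 24 * len(all_tables) == 4 * n**3 + 24 * n**2 + 44 * n + 24
        and 2 * len(classes) == (n + 1) * (n + 2)
        and 8 * max(ranks.values()) == (n + 3) * (n + 1)
    )

print({n: check_counts(n) for n in (3, 5, 7, 9)})
```

The check covers the Whitney numbers, the three totals, and the size of the most populated rank level; it does not verify that this size is the maximal cardinality of an antichain.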

Figure 1: Hasse diagram of the poset (T, ≤) for committee size n = 3, with transposed tables identified. The arrows correspond to the transitive reduction. All other relations are deduced by transitivity. The rank function ρ(x, y, z, t) = x − t is also represented.

Figure 2: Different possible shapes for function G of (18) according to the three cases in the proof of Theorem 4.5.

Figure 4: Region of optimality of the premiss-based (pb) rule, in the natural coordinates (θ, w). The thinner curves correspond to n = 3, 5, 7, approaching monotonically the curve θ = (2 − 2w)/(2 − w) as n → ∞. If θ < w, some tables of type b must leave the set T+; if (θ, w) falls below the lower curve, other tables must join those of type b in the set T+; and both things may happen at the same time, for θ < 2 − √2.

Figure 5: Partial order in T induced by the optimality condition (see Proposition 5.1).

Figure 6: The two rankings produced by the Python code: for rules in the form (x, y, z, t) and for rules in the form (ρ, α). In this example, the input was n = 3, w = 0.5, θ = (0.6, 0.7, 0.8).

2 s
i=1 x i = n}.The concepts of admissible rule and of false positive are easily extended.