The Probability of Intransitivity in Dice and Close Elections

Intransitivity often emerges when ranking three or more alternatives. The Condorcet paradox and Arrow's theorem are key examples of this phenomenon in the social sciences, and non-transitive dice are a fascinating aspect of games of chance. In this paper, we study intransitivity in natural random models of dice and voting. First, we follow a recent thread of research that aims to understand intransitivity for three or more $n$-sided dice (with non-standard labelings), where the pairwise ordering is induced by the probability, relative to 1/2, that a throw from one die is higher than a throw from the other. Conrey, Gabbard, Grant, Liu and Morrison studied, via simulation, the probability of intransitivity for a number of random dice models. Their findings led to a Polymath project studying three i.i.d. random dice with i.i.d. faces drawn from the uniform distribution on $1,\ldots,n$, and conditioned on the average of faces equal to $(n+1)/2$. The Polymath project proved that the probability that three such dice are intransitive is asymptotically 1/4. We study some related models and questions. Among other results, we show that if the uniform dice faces are replaced by any other continuous distribution (with some mild assumptions) and conditioned on the average of faces equal to zero, then three dice are transitive with high probability, in contrast to the unique behavior of the uniform model. We also extend our results to stationary Gaussian dice, whose faces, for example, can be the fractional Brownian increments with Hurst index $H\in(0,1)$. We study analogous questions in social choice theory, where we define a notion of almost tied elections in the standard voting model, and show that the probability of the Condorcet paradox for those elections approaches 1/4, in contrast to the unconditioned case. We also explore voting models where methods other than simple majority are used for pairwise elections.


Introduction
Intransitive dice and the Condorcet paradox are two examples of phenomena featuring an unexpected lack of transitivity. In this paper we present some quantitative results regarding their frequency in probabilistic settings. We introduce and present our results for dice and voting separately, making comparisons between the two settings where appropriate.

Intransitive dice
For purposes of this paper, we call an $n$-sided die (think of gambling dice) any vector $a = (a_1, \ldots, a_n)$ of real numbers. The face-sum of a die $a$ is $\sum_{i=1}^n a_i$. We say that die $a$ beats die $b$, denoted $a \succ b$, if a uniformly random face of $a$ has greater value than a uniformly random face of $b$ with probability greater than 1/2. In other words, $a \succ b$ if $\sum_{i,j=1}^n \mathbb{1}[a_i > b_j] > n^2/2$. We call a finite set of $n$-sided dice intransitive if the "beats" relation on the set cannot be extended to a linear order. That is, a set of dice is intransitive if it contains a subset $a^{(1)}, \ldots, a^{(k)}$ such that $a^{(1)} \succ a^{(2)} \succ \ldots \succ a^{(k)} \succ a^{(1)}$. A well-known and simple example with three sides is $a = (2, 4, 9)$, $b = (1, 6, 8)$ and $c = (3, 5, 7)$. One checks that $a \succ b \succ c \succ a$. If a set of dice forms a linear ordering, then we call it transitive. Because of ties there can be sets that are neither transitive nor intransitive, but they occur with negligible probability in the models we are studying.
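The three-sided example can be checked mechanically. The following is a minimal sketch, not part of the paper; the helper name `beats` is introduced here for illustration and simply counts the face pairs in which the first die shows the larger value.

```python
from itertools import product

def beats(a, b):
    """Return True if die a beats die b: more than half of all face pairs have a_i > b_j."""
    wins = sum(1 for x, y in product(a, b) if x > y)
    return 2 * wins > len(a) * len(b)

a = (2, 4, 9)
b = (1, 6, 8)
c = (3, 5, 7)

# Each die beats the next one in the cycle a -> b -> c -> a (5 winning pairs out of 9 each time).
print(beats(a, b), beats(b, c), beats(c, a))
```

Each comparison is won in 5 of the 9 face pairs, so all three calls return `True` and the set {a, b, c} is intransitive.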
Note that even though Theorem 1.1 is proved for three dice, a union bound implies that $k$ random dice are transitive with high probability for any fixed $k$. The proof considers random variables $W^{(kk')} := \sum_{i,j=1}^n \mathbb{1}[k_i < k'_j]$ and their normalized versions $\overline{W}^{(kk')}$ (centered at the mean $n^2/2$ and rescaled). Note that the dice are intransitive if and only if $\mathrm{sgn}\,\overline{W}^{(ab)} = \mathrm{sgn}\,\overline{W}^{(bc)} = \mathrm{sgn}\,\overline{W}^{(ca)}$. We show that the sum $\overline{W}^{(ab)} + \overline{W}^{(bc)} + \overline{W}^{(ca)}$ has small variance and, at the same time, each $\overline{W}^{(kk')}$ is anti-concentrated. This implies that with high probability the signs of the $\overline{W}^{(kk')}$ cannot be all equal.
To understand the differing behavior of normal and uniform dice implied by Theorem 1.1 and the Polymath result, we first note that, as shown by Polymath [Pol17a], for unconditioned dice with faces uniform in $(0, 1)$, the face-sums determine if $a$ beats $b$ with high probability. Without conditioning on face-sums, the distribution of the random variable $W = \sum_{i,j=1}^n \mathbb{1}[a_i < b_j]$ is the same regardless of whether the faces are Gaussian or uniform in $(0, 1)$ (this is because the value of $W$ is preserved by applying a strictly increasing function, in particular the Gaussian CDF $\Phi(\cdot)$, to the faces). Furthermore, noting that, under are uniformly distributed and (globally) weakly dependent, which suggests that with high probability the expression determines the winner. Experiments for the Gaussian and exponential distributions suggest the following conjecture: Note that in the conjecture, the distribution function depends on $n$ through the conditioning, though for large $n$, the marginal effect of the conditioning is small. One hopes that the proof strategy for unconditioned dice from [Pol17a] can be extended to prove Conjecture 1.2, at least for the Gaussian distribution, but we were unable to do so.
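The invariance of $W$ under strictly increasing maps can be illustrated numerically. The sketch below is an illustration only; the names `count_w` and `gaussian_cdf`, the die size and the seed are assumptions made here. It draws two unconditioned Gaussian dice, maps every face through the Gaussian CDF, and compares the two counts.

```python
import math
import random

def count_w(a, b):
    """W = number of pairs (i, j) with a_i < b_j."""
    return sum(1 for x in a for y in b if x < y)

def gaussian_cdf(x):
    """Standard normal CDF, a strictly increasing map from the reals onto (0, 1)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

random.seed(0)  # arbitrary seed, used only for reproducibility
n = 50          # illustrative die size
a = [random.gauss(0.0, 1.0) for _ in range(n)]
b = [random.gauss(0.0, 1.0) for _ in range(n)]

# Mapping every face through the CDF turns Gaussian faces into uniform ones
# without changing the outcome of any comparison a_i < b_j.
u = [gaussian_cdf(x) for x in a]
v = [gaussian_cdf(y) for y in b]

print(count_w(a, b), count_w(u, v))
```

The two printed counts coincide, which is why the unconditioned Gaussian and uniform models share the same distribution of $W$. The conditioning on face-sums is not preserved by such a map, and that is where the two models part ways.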

Condorcet paradox: Social chaos for close majority elections
The Condorcet paradox is a famous intransitive phenomenon in social choice theory. Consider $n$ voters trying to decide between $k$ alternatives. Each voter has a ranking (linear ordering) of the alternatives and we would like to aggregate the $n$ rankings into a global one. A natural approach is as follows: Given a pair of alternatives $a$ and $b$, we say that $a$ beats $b$ if a majority of voters put $a$ ahead of $b$ in their rankings (we always assume $n$ is odd to avoid dealing with ties). Aggregating these majority elections for all $K := \binom{k}{2}$ pairs of alternatives, we obtain a tournament graph on $k$ vertices, that is, a complete graph where each edge is directed.
If this tournament is transitive (i.e., it induces a linear ordering), or if there exists a Condorcet winner (i.e., an alternative that beats all others), we might conclude that there is a clear global winner of the election. However, the Condorcet paradox says that the pairwise rankings need not produce a Condorcet winner. For example, we might have three voters with rankings $a \succ b \succ c$, $b \succ c \succ a$ and $c \succ a \succ b$, respectively. Majority aggregation results in $a$ beating $b$, $b$ beating $c$ and $c$ beating $a$. Assume a probabilistic model with $n$ voters and $k$ alternatives, where each voter samples one of the $k!$ rankings independently and uniformly. This is called the impartial culture assumption and is the most common model studied in social choice. Despite the example above, one might hope that under impartial culture the paradox is unlikely to arise for a large number of voters. However, it was one of the earliest results in social choice theory [Gui52,GK68] that it is not so: In particular, letting $P_{\mathrm{Cond}}(k, n)$ be the probability of a Condorcet winner for $n$ voters and $k$ alternatives, and $P_{\mathrm{Cond}}(k) := \lim_{n\to\infty} P_{\mathrm{Cond}}(k, n)$, we have $P_{\mathrm{Cond}}(3) = \frac{3}{4} + \frac{3}{2\pi}\arcsin\frac{1}{3} \approx 91.2\%$. (1.1) For $k \ge 4$ there is no simple expression, but the numerical values up to $k = 50$ were computed by Niemi and Weisberg [NW68]; for example, $P_{\mathrm{Cond}}(10) \approx 51.1\%$ and $P_{\mathrm{Cond}}(27) \approx 25.5\%$; and the asymptotic behavior is given by May [May71] as which shows that $\lim_{k\to\infty} P_{\mathrm{Cond}}(k) = 0$. If one is interested in the probability of a completely transitive outcome, the best asymptotics known [Mos10] are $\exp(-\Theta(k^{5/3}))$. We note that there is a vast literature on different variants and models of voting paradoxes. See Gehrlein [Geh02] for one survey of results in related settings.
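The three-voter example can be verified with a few lines of code. This is a minimal sketch; the function names `pairwise_winners` and `condorcet_winner` are hypothetical helpers introduced here, and rankings are written as strings listing the alternatives from best to worst.

```python
from itertools import combinations

def pairwise_winners(rankings, alternatives):
    """Majority winner of each pairwise election; each ranking lists alternatives best-first."""
    winners = {}
    for x, y in combinations(alternatives, 2):
        x_votes = sum(1 for r in rankings if r.index(x) < r.index(y))
        # the number of voters is assumed odd, so there are no ties
        winners[(x, y)] = x if 2 * x_votes > len(rankings) else y
    return winners

def condorcet_winner(rankings, alternatives):
    """Return the alternative that wins all of its pairwise elections, or None if there is none."""
    winners = pairwise_winners(rankings, alternatives)
    for cand in alternatives:
        if all(w == cand for pair, w in winners.items() if cand in pair):
            return cand
    return None

profile = ["abc", "bca", "cab"]  # the three voters from the example
print(pairwise_winners(profile, "abc"))
print(condorcet_winner(profile, "abc"))
```

For this profile, a wins against b, b wins against c and c wins against a, so no alternative wins both of its pairwise elections and `condorcet_winner` returns `None`.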
Motivated by the dice models studied in [CGG+16] and [Pol17b], we look at the probability of the Condorcet paradox under impartial culture, conditioned on all pairwise elections being close to tied. More precisely, for each pair of alternatives $\{a, b\}$, define the random variable $S^{(ab)}$ to be the number of voters that prefer $a$ to $b$, minus the number of voters preferring $b$ to $a$. In other words, the sign of $S^{(ab)}$ determines the alternative that wins the pairwise election. Let $Y^{(ab)} := \mathrm{sgn}(S^{(ab)})$ and let $Y$ be the random tuple encoding the $K$ pairwise winners via the $Y^{(ab)}$, having $K$ entries with values in $\{-1, 1\}$. Furthermore, for $d \ge 1$, let $E_d$ be the event that $|S^{(ab)}| \le d$ for every pair $\{a, b\}$. We think of the event $E_d$ as "the elections are $d$-close", with $d = 1$ corresponding to almost perfectly tied elections.
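To make the conditioning concrete, the margins and the event that every margin has absolute value at most d can be enumerated exactly for three alternatives and a very small number of voters. The sketch below is an illustration under stated assumptions, not the paper's method; `margins` and `paradox_probability` are names introduced here, and brute force over all 6^n profiles is feasible only for tiny n. Such small cases are far from the asymptotic regime and serve only to illustrate the definitions.

```python
from fractions import Fraction
from itertools import permutations, product

ALTS = "abc"
RANKINGS = list(permutations(ALTS))            # the 3! = 6 rankings
PAIRS = [("a", "b"), ("b", "c"), ("c", "a")]   # the three pairs, listed in cyclic order

def margins(profile):
    """For each pair (x, y): voters preferring x to y minus voters preferring y to x."""
    return [sum(1 if r.index(x) < r.index(y) else -1 for r in profile)
            for x, y in PAIRS]

def paradox_probability(n, d):
    """Exact P[Condorcet paradox | all margins at most d in absolute value], n odd."""
    close = cyclic = 0
    for profile in product(RANKINGS, repeat=n):
        s = margins(profile)
        if all(abs(v) <= d for v in s):
            close += 1
            # with the pairs in cyclic order, a paradox means all three signs agree
            if all(v > 0 for v in s) or all(v < 0 for v in s):
                cyclic += 1
    return Fraction(cyclic, close)

print(paradox_probability(3, 3))          # d = n imposes no conditioning
print(paradox_probability(3, 1))          # every pairwise election decided by one vote
print(float(paradox_probability(5, 1)))
```

Already for three voters the conditioning roughly doubles the paradox probability compared with the unconditioned value.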
Our main result for voting uses a multidimensional local limit theorem to show that the probability of a Condorcet winner for almost tied elections goes to zero much faster than in (1.2). Actually, we prove the following stronger result.

Theorem 1.3. Let $n$ be odd, $d \ge 1$ and $y \in \{-1, 1\}^K$. Then, where $\alpha_k > 0$ depends only on $k$ and $o_k(1)$ denotes a function that, for every fixed $k$, goes to zero as $n$ goes to infinity. In particular, (1.5)

Comparing Theorem 1.3 to intransitivity of random uniform dice conditioned on their sum, first note that for almost tied elections and $k = 3$, the asymptotic probability of a Condorcet winner computed from (1.5) is 3/4, which is the same as the probability of transitivity for dice. On the other hand, there is a difference in the transition between the transitive and chaotic regimes. Assuming dice with faces uniform in $(-1, 1)$, the model is chaotic when conditioned on face-sums equal to zero, but, as shown by Polymath [Pol17a], it becomes transitive as soon as we condition on face-sums of absolute value at most $d$ for $d = \omega(\log n)$. However, the voting outcomes behave chaotically for $d$-close elections for any $d = o(\sqrt{n})$ and transition into the "intermediate", rather than transitive, regime given by (1.1). Furthermore, (1.3) means that the tournament on $k$ alternatives determined by $Y$ is asymptotically random. [CGG+16] conjectured that $k$ random dice also form a random tournament; however, [Pol17b] report experimental evidence against this conjecture. We also note that the proof of Theorem 1.3 can be modified such that its statement holds even when conditioning on only $K - 1$ out of $K$ pairwise elections being $d$-close.
The above-mentioned work by Kalai [Kal10] calls the situation when $Y$ is a random tournament social chaos. He considers the impartial culture model (without conditioning) and an arbitrary monotone odd function $f : \{-1, 1\}^n \to \{-1, 1\}$ for pairwise elections (the setting we considered so far corresponds to $f = \mathrm{Maj}_n$). Under these assumptions, he proves that social chaos is equivalent to the asymptotic probability of a Condorcet winner for three alternatives being equal to 3/4. [Kal10] contains another equivalent condition for social chaos, stated in terms of noise sensitivity of the function $f$ for only two alternatives. It is interesting to compare it with the reduction from three to two dice in Lemma 2.1 of [Pol17b].

Condorcet paradox: Generalizing close elections -A case study
It would be interesting to extend Theorem 1.3 to other natural pairwise comparison functions such as weighted majorities and recursive majorities, similar to the electoral college in the USA. However, in order to formulate such a result, it is first necessary to define $d$-close elections for an arbitrary function. We explore the difficulties around this issue in a simple example. Let us assume that there are three candidates $a, b, c$ and a number of voters $n$ that is divisible by three, letting $m := n/3$. We take $f : \{-1, 1\}^n \to \{-1, 1\}$ given by $f(x_1, \ldots, x_n) := \mathrm{sgn}\left(\sum_{i=1}^m \mathrm{sgn}(x_{3i-2} + x_{3i-1} + x_{3i})\right)$. In words, $f$ is a two-level majority: the majority of the votes of $m$ triplets, where the vote of each triplet is decided by majority.
The function f possesses many pleasant properties: It is odd, transitive symmetric and is a polynomial threshold function of degree three. We would like to devise a natural notion of d-close elections according to f and see if it results in chaotic behavior for small d, similar to Theorem 1.3.
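A small sketch of the two-level majority may help fix ideas. It is illustrative only; the function name `majority_of_triplets` is introduced here, and the number of triplets is assumed to be odd so that the outer majority has no ties. The code checks oddness exhaustively for nine voters and tabulates the sum of one triplet over its eight equally likely vote patterns.

```python
from collections import Counter
from itertools import product

def sign(v):
    return 1 if v > 0 else -1

def majority_of_triplets(x):
    """Two-level majority of a +1/-1 vote vector of length 3m, with m assumed odd."""
    assert len(x) % 3 == 0 and (len(x) // 3) % 2 == 1
    triplet_votes = [sign(sum(x[i:i + 3])) for i in range(0, len(x), 3)]
    return sign(sum(triplet_votes))

# Oddness: flipping every vote flips the outcome (all 2^9 inputs with m = 3 are checked).
is_odd = all(majority_of_triplets([-v for v in x]) == -majority_of_triplets(x)
             for x in product([-1, 1], repeat=9))
print(is_odd)

# Distribution of one triplet sum over the 8 equally likely vote patterns.
triplet_sums = Counter(sum(t) for t in product([-1, 1], repeat=3))
print(triplet_sums)
```

The outcome can differ from simple majority: for the votes (1, 1, -1, 1, 1, -1, -1, -1, -1) two of the three triplets vote +1, so the two-level majority is +1 although five of the nine voters voted -1.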
To start with, let $w_i := x_{3i-2} + x_{3i-1} + x_{3i}$. In the following we will sometimes treat $f$ as a function of $w := (w_1, \ldots, w_m)$, that is, $f : \{\pm 1, \pm 3\}^m \to \{\pm 1\}$, with the distribution of $w$ induced by the distribution of $x$, i.e., each of the values $w_i = \pm 3$ has probability 1/8 and each of the values $w_i = \pm 1$ has probability 3/8. A CLT argument as in Theorem 1.3 implies chaotic elections for $f$ if we define "$d$-close" as "$\left|\sum_{i=1}^m \mathrm{sgn}\left(w_i^{(kk')}\right)\right| \le d$" for every pair of candidates $(kk')$. However, this is not very satisfactory for at least two reasons. First, it does not accord well with our intuition of closeness, with the problem becoming more apparent when considering the analogous condition for other two-level majorities, say $\sqrt{n}$ groups of $\sqrt{n}$ voters each. Second, it does not seem to extend to other functions that do not have such an "obvious" summation built into them. Another idea is to define "$d$-close" the same way as in Theorem 1.3, that is as $\left|\sum_{i=1}^n x_i^{(kk')}\right| \le d$ for every pair $(kk')$. Clearly, this is not a good closeness measure for an arbitrary comparison method (e.g., weighted majority with large differences between weights), but one could argue that it is relevant at least for transitive symmetric functions. Using another CLT argument, we find that for this definition of closeness, the behavior of $o(\sqrt{n})$-close elections under $f$ is not chaotic: The asymptotic Condorcet paradox probability is slightly less than 25%. Note that for three candidates, the Condorcet paradox happens if and only if $f(x^{(ab)}) = f(x^{(bc)}) = f(x^{(ca)})$.

Theorem 1.4. Under the notation above and the event $E_d$ as defined in Section 1.
For comparison, without conditioning the Condorcet paradox probability is ≈ 12.5% when the elections are according to f and ≈ 8.8% according to majority.
The idea for the proof of Theorem 1.4 is to use multivariate Berry-Esseen theorem for random variables , kk ′ ∈ {ab, bc, ca} .
We are looking at sign patterns of $B^{(kk')}$ conditioned on small absolute values of $A^{(kk')}$. $A^{(kk')}$ and $B^{(kk')}$ are not perfectly correlated and it turns out that part of the (negative) correlations between $B^{(ab)}$, $B^{(bc)}$ and $B^{(ca)}$ is not attributable to correlations between $A^{(ab)}$, $A^{(bc)}$ and $A^{(ca)}$. Hence, even after conditioning on small $A^{(kk')}$ there remains a small constant correlation between the $B^{(kk')}$, which prevents completely chaotic behavior.

Another promising definition of closeness involves the noise operator $T_\rho$ from the analysis of Boolean functions (see, e.g., [O'D14] for more details). Let $\rho \in [-1, 1]$ and $x \in \{-1, 1\}^n$. Define a probability distribution $N_\rho(x)$ over $\{-1, 1\}^n$ such that $y_1, \ldots, y_n$ are sampled independently with $y_i = -x_i$ with probability $\varepsilon := \frac{1-\rho}{2}$ and $y_i = x_i$ otherwise. Note that $\mathbb{E}[x_i y_i] = \rho$, hence we say that a pair $(x, y)$ sampled as uniform $x$ and then $y$ according to $N_\rho(x)$ is $\rho$-correlated. Given $\rho$ and $x$, the noise operator $T_\rho$ is defined as $T_\rho f(x) := \mathbb{E}_{y \sim N_\rho(x)}[f(y)]$. For $\rho \in (0, 1)$ one can think of $N_\rho(x)$ as a distribution over $\{-1, 1\}^n$ with the probabilities that are decreasing in the Hamming distance to $x$. Furthermore, for majority and This suggests that it may be fruitful to define "d-close" as " The idea becomes even more appealing when considering a Fourier-analytic Condorcet formula discovered by Kalai [Kal02]. He showed that for an odd function $g : \{-1, 1\}^n \to \{-1, 1\}$ and 1/3-correlated vectors $(x, y)$ the probability of the Condorcet paradox without conditioning is equal to Another feature of the $T_\rho$ operator is that for noise sensitive functions (which [Kal10] proved to be exactly those that result in chaotic elections without conditioning) the value $|T_\rho f(x)|$ is $o(1)$ with high probability over $x$. A possible interpretation of this fact is that elections according to a noise sensitive function are almost always close.
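For small n the noise operator can be evaluated exactly by summing over all flip patterns. The sketch below assumes the standard definition from the analysis of Boolean functions, namely that the operator returns the expected value of f(y) when each coordinate of x is flipped independently with probability (1 - ρ)/2. The names `noise_operator`, `maj` and `dictator` are introduced here for illustration.

```python
from fractions import Fraction
from itertools import product

def noise_operator(f, x, rho):
    """Exact E[f(y)], where each y_i equals -x_i independently with probability (1 - rho)/2."""
    eps = (1 - Fraction(rho)) / 2
    total = Fraction(0)
    for flips in product([False, True], repeat=len(x)):
        prob = Fraction(1)
        for flipped in flips:
            prob *= eps if flipped else 1 - eps
        y = [-xi if flipped else xi for xi, flipped in zip(x, flips)]
        total += prob * f(y)
    return total

def maj(y):
    return 1 if sum(y) > 0 else -1

def dictator(y):
    return y[0]

rho = Fraction(1, 3)
print(noise_operator(dictator, [1, -1, 1], rho))  # equals rho * x_1
print(noise_operator(maj, [1, 1, 1], rho))
```

For a dictator the value is ρ times the first vote, in line with the observation that coordinatewise correlation equals ρ. The exhaustive sum has 2^n terms, so this approach is meant only for checking small cases by hand.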
Recall our "majority of triplets" function f and define the event F ρ,d as At first sight, (1.6) suggests that the event F ρ,d , with ρ = 1/3 and d = o( √ m), should cause the expectation term in (1.6) to vanish and the probability of Condorcet paradox to approach 1/4. Surprisingly, this is not the case for f : The proof of Theorem 1.5 is a complication of the proof of Theorem Then we observe that, just as for majority the value of T ρ Maj(x) is proportional to the number of ones in x minus n/2, also for f the value of T ρ f (w) is proportional to a certain linear combination of V b (w). This allows us to proceed with an identical argument as in Theorem 1.4 with appropriately redefined random variables A (kk ′ ) .
Some more recent results show that, without conditioning, majority in fact maximizes the probability of a Condorcet winner among "low-influence functions" (see [MOO10] for three voters and [Mos10, IM12] for the general case). This contrasts with Theorems 1.4 and 1.5 for different definitions of close elections.

Arrow's theorem for dice
To further consider the parallels between dice and social choice, we also ask if there is a dice analogue of Arrow's theorem (and its quantitative version). We obtain a rather generic statement that does not use any properties of dice and a quantitative version which is a restatement of a result on tournaments by Fox and Sudakov [FS08].
Organization of the paper
The proofs of our main theorems are located in Sections 2 (Theorem 1.1), 3 (Theorem 1.3) and 4 (Theorems 1.4 and 1.5). Section 5 contains the discussion of Arrow's theorem for dice.

Gaussian dice
We are going to prove Theorem 1.1, which says that three random dice $a$, $b$ and $c$ with standard Gaussian faces, conditioned on the face-sum being equal to zero, are transitive with high probability. One possible approach would be to generalize the proof of transitivity for unconditioned uniform dice in [Pol17a]; however, we follow a slightly different idea.
We rely considerably on the Hilbert space structure of joint Gaussians (see, e.g., [Jan97] for an exposition of this view). In particular, we use the fact that we can write a random Gaussian face-sum zero die $a = (a_1, \ldots, a_n)$ as $a_i = x_i - \frac{1}{n}\sum_{j=1}^n x_j$, where $x_1, \ldots, x_n$ are $n$ independent standard Gaussians. This implies that $a_1, \ldots, a_n$ are jointly Gaussian with $\mathrm{Var}[a_i] = 1 - 1/n$ and $\mathrm{Cov}[a_i, a_j] = -1/n$ for $i \ne j$.
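The covariance structure follows from writing the centered faces as a linear image of independent standard Gaussians. The sketch below is an illustrative check; `centered_covariance` is a name introduced here. It forms the centering matrix P = I - J/n, where J is the all-ones matrix, and computes P Pᵀ exactly with rational arithmetic.

```python
from fractions import Fraction

def centered_covariance(n):
    """Covariance matrix of a = P x for x with identity covariance, where P = I - J/n."""
    p = [[Fraction(int(i == j)) - Fraction(1, n) for j in range(n)] for i in range(n)]
    # Cov[a] = P I P^T = P P^T
    return [[sum(p[i][k] * p[j][k] for k in range(n)) for j in range(n)]
            for i in range(n)]

n = 5  # illustrative number of faces
cov = centered_covariance(n)
print(cov[0][0], cov[0][1])  # variance of one face and covariance of two distinct faces
print(sum(cov[0]))           # each row sums to zero because the face-sum is constant
```

The diagonal entries equal 1 - 1/n and the off-diagonal entries equal -1/n. The rows sum to zero, which reflects the fact that the face-sum of the conditioned die is deterministic.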
Note that $\mathbb{E}\, W^{(kk')} = n^2/2$. We start with computing the covariance structure of $W^{(kk')}$.
Let us look at another case: $W^{(ab)}_{11}$ and $W^{(ab)}_{12}$. Take $A := \sqrt{\frac{n}{2(n-1)}}\,(b_1 - a_1)$ and $B := \sqrt{\frac{n}{2(n-1)}}\,(b_2 - a_1)$. This time $A$ and $B$ are joint standard Gaussians with correlation (2.5) Putting (2.4) and (2.5) together, and using bilinearity to decompose $\mathrm{Var}\, W^{(ab)}$ into three classes of terms, We move to covariances of $W^{(ab)}$ and $W^{(bc)}$. There are two cases to consider, analogous to the first part of the proof.
First, take W (ab) 11 and W (bc) The final case is W (ab) 11 and W (bc) (2.7) Combining (2.6) and (2.7), In particular, by Chebyshev's inequality, One hopes that, since there is only a modest amount of correlation in the random vectors a and b, the distribution of W := W (ab) converges to a standard Gaussian. We prove a weaker statement: It is at least as anti-concentrated as a Gaussian of constant variance.
Lemma 2.3. For every C ∈ R and ε > 0, we have where the constants in the O(·) notation are absolute, in particular independent of C or ε.
Proof. For ease of presentation, we assume that n is even and define m := n/2. Let us consider the die a first. Let x 1 , . . . , x n be i.i.d. standard Gaussians such that a i = x i − n i=1 x i /n. By standard Chernoff-Hoeffding bounds, all three events considered below have exponentially small probability:  the m pairs (b 1 , b 2 ), (b 3 , b 4 ), . . . , (b n−1 , b n ) become independent.
Recall the unnormalized random variable W := W (ab) and write it as by an argument very similar to the one that led to (2.8) we get that u 1 , . . . , u m are "good" with high probability in the following sense: . Recall the assumptions we made on a in (2.8). We consider Y k as a function of s k and make some ad-hoc computations concerning Y k (s k ) and Φ(·): This implies that any such Y k has (conditional) variance Var[Y k ] ≥ Ω(n 2 ).
Putting everything together, for every good a and u 1 , . . . , u m , we get independent random variables Y 1 , . . . , Y m such that σ 2 := Var [ m k=1 Y k ] ≥ Ω(n 3 ); the Ω(·) does not depend on a or u 1 , . . . , u m . Furthermore, clearly, E |Y i − E Y i | 3 ≤ 8n 3 . Let N be a standard Gaussian independent of everything else, C ∈ R and ε > 0. The Berry-Esseen theorem for non-identically distributed variables gives (2.10) Recall that W = W −n 2 /2 √ αn 3 and observe that σ √ αn 3 = O(1). Continuing (2.10), we obtain, still conditioned on good a and u 1 , . . . , u m , where all constants in the O(·) notation are absolute. Finally, averaging over all a and u 1 , . . . , u m and applying union bound over those that are not good, we get the desired conclusion: First, by union bound, for δ > 0 (2.11) We set δ := 1 n 1/3 and bound both right-hand side terms of (2.11). By Lemma 2.2, and by Lemma 2.3,

Condorcet paradox for close elections: Majority
This section contains the proof of Theorem 1.3.

Notation
We start with recalling and extending the model and notation.
There are $n$ voters (where $n$ is odd) and each of them independently chooses one of the $k!$ rankings of the alternatives uniformly at random. For voter $i$, such a random ranking gives rise to a random tuple $x_i \in \{-1, 1\}^K$ of pairwise choices (according to some fixed ordering of pairs). We call each of the $k!$ tuples in the support of $x_i$ transitive. Any other tuple is intransitive. We say that a tuple has a Condorcet winner if it has an alternative that beats every other alternative.
We denote aggregation over voters by boldface. Therefore, we write x = (x 1 , . . . x n ) for the random vector of voter preferences (where each element is itself a random tuple of length K).
Given voter preferences, we say that the voting outcome is intransitive if the aggregated tuple Y is intransitive. Similarly, we say that there is a Condorcet winner if tuple Y has a Condorcet winner.
We are interested in situations where elections are "almost tied" or, more precisely, "$d$-close" for $d \ge 1$. Specifically, we define $E_d$ to be the event where $\|S\|_\infty \le d$, i.e., $|S^{(j)}|$ is at most $d$ for every $j \in [K]$.

Local CLT
We use a theorem and some definitions from the textbook on random walks by Spitzer [Spi76]. In accordance with the book, we define: Definition 3.1. A k-dimensional random walk (X i ) i∈N is a Markov chain over Z k with X 0 = 0 k and a distribution of one step Z i+1 := X i+1 − X i that does not depend on i.
), note that $(S_i)_{i \in \{0, \ldots, n\}}$ is a random walk over $\mathbb{Z}^K$ and that we want to compute $\mathbb{P}(\mathrm{sgn}(S_n) = y \mid E_d)$, for $y \in \{-1, 1\}^K$. There is one technicality we need to address to apply a local CLT: Since the steps of our random walk are in $\{-1, 1\}^K$, the values of $(S_i)$ lie on a proper sublattice of $\mathbb{Z}^K$, namely, $S_i^{(j)}$ always has the same parity as $i$. To deal with this, we define $T_i^{(j)} := (S_{2i+1}^{(j)} - 1)/2$. Note that $(T_i)$ is still a random walk over $\mathbb{Z}^K$, with one catch: The starting point $T_0$ is not necessarily the origin, but rather one of the $k!$ points in $\{-1, 0\}^K$ corresponding to a transitive tuple picked by the first voter.
Before we state the local CLT, we need another definition: Definition 3.2 ([Spi76], D1 in Section 5). A random walk over $\mathbb{Z}^K$ is strongly aperiodic if for every $t \in \mathbb{Z}^K$, the subgroup of $\mathbb{Z}^K$ generated by the points that can be reached from $t$ in one step is equal to $\mathbb{Z}^K$. Now we are ready to state the theorem: Theorem 3.3 (Local CLT, Remark after P9 in Section 7 of [Spi76]). Let $(T_i)_{i \in \mathbb{N}}$ be a strongly aperiodic random walk over $\mathbb{Z}^K$, starting at the origin and with a single step $Z$, i.e., where the $o(1)$ function depends on $n$, but not on $t$.
Our main lemma states that the joint distribution of $T_n$ conditioned on $\|T_n\|_\infty$ being small is roughly uniform.
Lemma 3.4. For the random walk $(T_i)$ defined above and $t \in \mathbb{Z}^K$, $d \ge 1$ such that $\|t\|_\infty \le d$, there are some $\alpha_k, \beta_k > 0$ such that Proof. We first deal with the technicality that we mentioned before: The starting point $T_0$ of the random walk is itself a random variable. In the proof below we proceed conditioning on $T_0 = 0^K$. After reading the proof it should be clear how to modify it for other starting points in $\{-1, 0\}^K$. (3.1) is obtained from those conditional results by the triangle inequality. We need to check that the random walk $(T_i)$ satisfies the hypotheses of Theorem 3.3. First, note that the "step" random variable $Z$ for $(T_i)$ has the same distribution as $(X_1 + X_2)/2$, i.e., half the sum of two steps of our original random process.
To show that $(T_i)$ is strongly aperiodic, let $(e^{(1)}, \ldots, e^{(K)})$ be the standard basis of $\mathbb{Z}^K$. Note that it is enough to show that for each $z \in \mathbb{Z}^K$, all of $z, z + e^{(1)}, \ldots, z + e^{(K)}$ are reachable from $z$ in one step. But this is so:
• It is possible to stay at $z$ by choosing a permutation (ranking) $\tau$ for $X_1$ and then its reverse $\tau^R$ for $X_2$.
• We explain how one can move from $z$ to $z + e^{(j)}$ on an example and hope it is clear how to generalize it. For $k = 5$ and $e^{(j)}$ corresponding to the $b$ vs. $d$ comparison, one can choose a ranking $b > d > a > c > e$ for $X_1$ followed by $e > c > a > b > d$ for $X_2$.
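The two bullet points can be checked directly for five alternatives. The following sketch is illustrative; `pairwise_tuple` is a helper introduced here, and the order of pairs is the lexicographic one produced by `itertools.combinations`. It encodes each ranking as a tuple of ±1 pairwise choices and forms the half-sum of two consecutive voters, which is one step of the halved walk.

```python
from itertools import combinations

def pairwise_tuple(ranking, alternatives):
    """+1 if x is ranked above y, else -1, for every pair (x, y) in a fixed order of pairs."""
    return [1 if ranking.index(x) < ranking.index(y) else -1
            for x, y in combinations(alternatives, 2)]

alternatives = "abcde"
pairs = list(combinations(alternatives, 2))

x1 = pairwise_tuple("bdace", alternatives)  # b > d > a > c > e
x2 = pairwise_tuple("ecabd", alternatives)  # e > c > a > b > d

# One step of the halved walk is the half-sum of two consecutive voters.
step = [(u + v) // 2 for u, v in zip(x1, x2)]
print({pair: s for pair, s in zip(pairs, step) if s != 0})

# A ranking followed by its reverse gives the zero step.
stay = [(u + v) // 2 for u, v in zip(pairwise_tuple("abcde", alternatives),
                                      pairwise_tuple("edcba", alternatives))]
print(any(stay))
```

The only nonzero coordinate of `step` is the one for the pair (b, d), so the step equals the corresponding basis vector. The second check confirms that a ranking and its reverse cancel in every coordinate.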
Since Theorem 3.3 applies, we have which can be rewritten as Finally we observe that t = dt ′ for some t ′ with t ′ ∞ ≤ 1, so we have as we needed.

Proof of Theorem 1.3
Recall that we want to prove (1.3), that is After we have (1.3), the bounds (1.4) and (1.5) easily follow by triangle inequality. For y ∈ {−1, 1} K , let S y := s ∈ (2Z + 1) K : Furthermore, note that |S y | = |S y ′ | for every y, y ′ . Set M := |S y | as the common cardinality of the S y sets.
First, we use Corollary 3.5 to show that the probability $\mathbb{P}[Y = y \mid E_d]$ must be close to $q$ : where $\alpha_k$ is the constant from Corollary 3.5: The value of $q$ depends on $k$, $n$ and $d$, but not on $y$. The implication is that the conditional probabilities must be almost equal for every pair $y, y'$ : But this is all we need, since

Remark 3.6. A similar bound with an explicit
(implying chaotic behavior for n 1/2−1/K ≪ d ≪ n 1/2 ) can be achieved using a multidimensional Berry-Esseen theorem instead of the local CLT.
Remark 3.7. As we mentioned in Section 1.2, the proof of Theorem 1.3 can be modified to give a similar bound The reason for this is that if we remove conditioning from just one S (a 0 b 0 ) , there are still no covariance factors in the CLT computation that would steer the distribution of Y away from uniform.

Condorcet paradox for close elections: Majority of triplets
Recall that we are considering odd n = 3m voters, alternatives a, b, c and random variables x This section contains proofs of non-chaotic behavior of f under certain conditionings. Section 4.1 contains proof of Theorem 1.4, dealing with conditioning on small In Section 4.2 we prove Theorem 1.5, which considers conditioning on small T ρ f x (kk ′ ) .

Proof of Theorem 1.4
For i ∈ [m], we take random tuple . Note that Z 1 , . . . , Z m are i.i.d. Let us compute the first two moments of a single-coordinate distribution Z = (A (ab) , A (bc) , A (ca) , B (ab) , B (bc) , B (ca) ). For this keep in mind that Cov x = −1/3 and refer to Table 1 for joint distribution of w (kk ′ ) and w (k ′ k ′′ ) : N (kk ′ ) be joint standard Gaussians with the same covariance structure as A (kk ′ ) and B (kk ′ ) respectively. After checking that our six by six covariance matrix is not singular, by multidimensional Berry-Esseen theorem (see the statement e.g., in [Ben05]), we can move to the Gaussian space: Using the covariance structure of M (kk ′ ) and N (kk ′ ) and the geometry of joint Gaussians, we can conclude that each N (kk ′ ) can be written as where the variables R (kk ′ ) are standard Gaussians independent of the M (kk ′ ) such that Cov R (kk ′ ) , R (k ′ k ′′ ) = −1/27. To continue the computation in (4.1), first note that by considering the probability density function of M (kk ′ ) , we have Fix some values of M (kk ′ ) such that M (kk ′ ) ∞ ≤ 1 √ 3ln n . From (4.2) and evaluating the probability È R (ab) ≥ 0 ∧ R (bc) ≥ 0 ∧ R (ca) ≥ 0 in a computer algebra system, we see that after this conditioning Putting together (4.1), (4.2) and (4.3), we can finally conclude that, for n big enough, letting C :=

Proof of Theorem 1.5
The proof of Theorem 1.5 is an unpleasant complication of the proof of Theorem 1.4, which is a recommended preliminary reading. In particular, we will use the notation that was developed there. From now on the constants in the O(·) notation are allowed to depend on ρ. Recall that for x ∈ {−1, 1} n and w ∈ {±3, ±1} m we have defined Note that T ρ f (w) = 2 (È[f (z) = 1] − 1/2), where z is generated from y, which in turn is generated according to N ρ (x). Therefore, in light of (4.5) and (4.6) we have where the sum under the probability sign is over four independent random variables with binomial distributions. It should be possible to apply a CLT argument to conclude that, for most values of W b (w), the value of T ρ f (w) is proportional to where q 3 := p 3 − 1/2 and q 1 := p 1 − 1/2. We will state a precise lemma now and continue with the proof, proving the lemma at the end: . Let Take C := π 2 σ and define events Let ∆ stand for a symmetric difference of events. Then, Assuming Lemma 4.1 we can continue along the lines of the proof of Theorem 1.4. Same as there, let B . The random variables Z 1 , . . . , Z m are i.i.d. and for CLT purposes we can compute (again Table 1 is helpful) the six by six covariance matrix Q of the distribution of Z := Z 1 : . Applying Lemma 4.1 and multidimensional Berry-Esseen theorem (using computer algebra system to check that the covariance matrix is invertible for every ρ ∈ (0, 1)), . (4.10) Computations in a computer algebra system lead to expressing N (kk ′ ) as a linear combination in the following way: where γ > 0, the random tuples M (kk ′ ) are independent of each other and each R (kk ′ ) is a standard Gaussian. Some more computation shows that (4.12) Interestingly, the bound in (4.12) is independent of ρ and approaches −1/27 as ρ goes to zero. (4.11) and (4.12) imply that after conditioning on fixed values of M (kk ′ ) such that ≈ 11.6%. (4.13) At the same time, (4.14) and (4.10), (4.13) and (4.14) lead to as we wanted. 
It remains to prove Lemma 4.1.
Proof of Lemma 4.1. Recall the definitions of W b (w) and V b (w). We begin with estimating T ρ f (w) for a fixed w. In the following we will sometimes drop dependence on w (writing, e.g., ) in the interest of clarity. Recall equation (4.7) and let Z := m i=1 Z i be the sum of m independent random variables arising out of the four binomial distributions featured there. We have: for t := t(w) := . Since the random variables Z i are bounded, we can apply Berry-Esseen theorem and get (using Φ(x) = 1/2 + 1/2 erf (x/ √ 2)) From now on we consider a random election with random vote vectors x (ab) , x (bc) , x (ca) that induce random vectors w (ab) , w (bc) , w (ca) . First, consider the marginal distribution of w. Since t(w) can be written as a sum of m i.i.d. random variables σ 2 mt(w) = m i=1 t i (w i ) with E[t i ] = 0 and |t i | ≤ 1, a standard concentration bound gives (4.16) We turn to estimating the symmetric difference We will use union bound over a small number (twelve) cases and show that each of them has probability O(ln −5 m). We proceed with two examples, noting that the rest are proved by symmetric versions of the same argument. First, due to (4.15) and the Taylor expansion as well as symmetric versions of (4.17) for reverse inequality and ±1/ ln m. We use (4.17) and (4.16) to estimate the first example coming from È[G 1 ∧ ¬G 2 ]: Using Berry-Esseen theorem as in the proof of Theorem 1.5 we get jointly normal centered random variables M (ab) , M (bc) , M (ca) with covariances given by (4.8) and (4.9), for which we know that Finally, the second example stemming from È[¬G 1 ∧ G 2 ] is bounded in a similar manner:

Arrow's theorem for dice
Arguably the most famous result in social choice theory is Arrow's impossibility theorem [Arr50,Arr63]. Intuitively, it states that the only reasonable voting systems based on pairwise comparisons that never produce a Condorcet paradox are "dictators", i.e., functions whose values depend only on a single voter. There are also quantitative versions, proved by Kalai [Kal02] for balanced functions and by Mossel [Mos12] for general functions (with tighter bounds obtained by Keller [Kel12]). For simplicity we consider three alternatives and the impartial culture model. Then, the quantitative Arrow's theorem says that a reasonable pairwise comparison function $f$ that is $\varepsilon$-far from every dictator (in the sense of normalized Hamming distance) must be such that the probability of the Condorcet paradox is at least $\Omega(\varepsilon^3)$.
There is an analogous question about transitive dice: What are the methods for pairwise comparisons of k dice that always produce a linear order? In particular, we know that comparing two dice a and b by using the "beats" relation is not one of them.
We restrict ourselves to k = 3. Assume that we look at dice with n sides labeled with  distinct dice a, b, c such that f (a, b) = f (b, c) = f (c, a).
A little thought reveals that the answer is somewhat trivial. Let O be a linear order on D m,n . We think of O as an injective function O : D m,n → R. If we define f as On the other hand, every transitive f must be of this form. To see this, consider a directed graph with vertex set D m,n where there is an edge from a to b if and only if f (a, b) = −1. This graph is a tournament and transitivity of f means that it does not contain a directed triangle. But a triangle-free tournament does not contain a directed cycle and, therefore, induces a linear order on its ground set.
We can extend this reasoning to a quantitative result. It seems easiest to assume a model where a set of three dice is sampled u.a.r. from D m,n .
There is a result about tournaments due to Fox and Sudakov [FS08]. A tournament on n vertices is called ε-far from transitive if at least εn 2 of its edges must be reversed to obtain a transitive tournament.
Theorem 5.1 ( [FS08]). There exists c > 0 such that if a tournament on n vertices is ε-far from transitive, then it contains at least cε 2 n 3 directed triangles.
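The objects in this statement are easy to compute for small tournaments. The sketch below only illustrates the definitions and does not reproduce the argument of [FS08]. Both function names are introduced here; the second uses the classical identity that a triple of vertices fails to be a directed triangle exactly when one of its vertices beats the other two.

```python
from itertools import combinations
from math import comb

def directed_triangles(beats, n):
    """Count cyclic triples in a tournament; beats[i][j] is True when i beats j."""
    count = 0
    for i, j, k in combinations(range(n), 3):
        if (beats[i][j] and beats[j][k] and beats[k][i]) or \
           (beats[j][i] and beats[k][j] and beats[i][k]):
            count += 1
    return count

def triangles_from_scores(beats, n):
    """Same count via out-degrees: all triples minus those with a vertex beating both others."""
    return comb(n, 3) - sum(comb(sum(row), 2) for row in beats)

# Transitive tournament on 4 vertices: i beats j exactly when i < j.
transitive = [[i < j for j in range(4)] for i in range(4)]
# A 3-cycle 0 -> 1 -> 2 -> 0.
cycle = [[False, True, False], [False, False, True], [True, False, False]]

print(directed_triangles(transitive, 4), directed_triangles(cycle, 3))
print(triangles_from_scores(transitive, 4), triangles_from_scores(cycle, 3))
```

A transitive tournament has no directed triangles, while the 3-cycle consists of exactly one. This is the same dichotomy as between a transitive and an intransitive triple of dice.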
Theorem 5.1 can be restated as a quantitative Arrow-like statement for dice.
Corollary 5.2. There exists c > 0 such that if a comparison function f on D m,n with m, n > 1 is ε-far from transitive, then the probability that a random triple of dice is intransitive is at least cε 2 .

Acknowledgement
We thank Timothy Gowers for helpful discussions of [Pol17b] and Kathryn Mann for asking if there is an "Arrow's theorem" for dice.