Robust Bounds on Choosing from Large Tournaments

Tournament solutions provide methods for selecting the "best" alternatives from a tournament and have found applications in a wide range of areas. Previous work has shown that several well-known tournament solutions almost never rule out any alternative in large random tournaments. Nevertheless, all analytical results thus far have assumed a rigid probabilistic model, in which either a tournament is chosen uniformly at random, or there is a linear order of alternatives and the orientation of all edges in the tournament is chosen with the same probabilities according to the linear order. In this work, we consider a significantly more general model where the orientation of different edges can be chosen with different probabilities. We show that a number of common tournament solutions, including the top cycle and the uncovered set, are still unlikely to rule out any alternative under this model. This corresponds to natural graph-theoretic conditions such as irreducibility of the tournament. In addition, we provide tight asymptotic bounds on the boundary of the probability range for which the tournament solutions select all alternatives with high probability.


Introduction
Tournaments play an important role in numerous situations as a means of representing entities and a dominance relationship between them. For instance, both the outcome of a round-robin sports competition and the majority relation of voters in an election can be represented by a tournament. A question that occurs frequently is therefore the following: Given a tournament, how can we choose the "best" alternatives in a consistent manner? This question has been addressed by a rich and beautiful literature on tournament solutions, which have found applications in areas ranging from sports competitions (Ushakov, 1976) to multi-criteria decision analysis (Arrow and Raynaud, 1986;Bouyssou, 2004) to biology (Schjelderup-Ebbe, 1922;Landau, 1953;Slater, 1961;Allesina and Levine, 2011). Over the past half century several tournament solutions have been proposed, two of the oldest and best-known of which are the top cycle (Good, 1971;Schwartz, 1972;Miller, 1977) and the uncovered set (Miller, 1980). 1 Given that the purpose of tournament solutions is to discriminate the "best" alternatives from the remaining ones, it perhaps comes as a surprise that many common tournament solutions-including the top cycle, the uncovered set, the Banks set, and the minimal covering set-select all alternatives with high probability in a large random tournament (Fey, 2008;Scott and Fey, 2012). Put differently, the aforementioned tournament solutions almost never exclude any alternative in a tournament chosen at random. Nevertheless, these results are based on the uniform random model, in which all tournaments are drawn with equal probability, or equivalently each edge is oriented in one direction or the other with equal probability independently of other edges. For a large majority of applications of tournaments, one would not expect that this assumption holds. 
Indeed, stronger teams are likely to beat weaker teams in a sports competition, and candidates with a large base of support have a higher chance of winning an election. Moreover, real-world tournaments often exhibit a certain degree of transitivity: If alternatives a, b, and c are such that a dominates b and b dominates c, then it is more likely that a dominates c than the other way around.
A more general model of random tournaments is the Condorcet random model, previously considered by Frank (1968), Łuczak et al. (1996), Vassilevska Williams (2010) and Kim et al. (2017). In this model, there is a linear order of alternatives, which can be interpreted as an ordering of the alternatives from strongest to weakest. For each pair of alternatives, the probability that the edge is oriented from the alternative that occurs later in the linear order to the alternative that occurs earlier in the linear order is p, independently of other pairs of alternatives. 2 Crucially, the value of p is the same for all pairs of alternatives. The Condorcet random model generalizes the uniform random model, since the latter can be obtained from the former by taking p = 1/2. Łuczak et al. (1996) showed that under the Condorcet random model, the top cycle selects all alternatives with high probability as long as p ∈ ω(1/n). The same authors showed furthermore that this bound is tight, that is, the statement no longer holds if p ∈ O(1/n). 3 Although the Condorcet random model addresses the issues raised above with regard to the uniform random model, it is still rather unrealistic for two important reasons. Firstly, in real-world tournaments, the orientations of different edges are typically determined by different probabilities. For instance, in a sports tournament the probability that a very strong team beats a very weak team is usually higher than the probability that a moderately strong team beats a moderately weak team; a similar phenomenon can be observed in elections.
1 For a thorough treatment of tournament solutions, we refer the reader to excellent surveys by Laslier (1997) and . 2 By symmetry, we may assume without loss of generality that p ≤ 1/2. 3 See, e.g., Cormen et al. (2009) for the definitions of asymptotic notations.
Secondly, even though one can roughly order the alternatives in a tournament according to their strength, it is often the case that not all probabilities of the orientation of the edges respect the ordering. Indeed, this precisely corresponds to the notion of "bogey teams"-weak teams that nevertheless frequently beat certain supposedly stronger teams. Given the limitations of the uniform random model and the Condorcet random model, it is natural to ask whether previous results continue to hold under more general and realistic models of random tournaments, or whether they break down as soon as we move beyond these restricted models.
In this paper, we show that a number of tournament solutions, including the top cycle and the uncovered set, still choose all alternatives with high probability under a significantly more general model of random tournaments. Unlike the Condorcet random model, our model does not rely on an ordering of the alternatives. Instead, the orientation of each edge is determined by probabilities within the range [p, 1 − p] for some parameter p, and these probabilities are allowed to vary across edges. The only substantive assumption that we make is that the orientations of different edges are chosen independently from one another. Under this model, which is more general than both the uniform random model and the Condorcet random model, we establish in Section 3 that the top cycle almost never rules out any alternative as long as p ∈ ω(1/n), thus generalizing the result by Łuczak et al. (1996). We also show that our bound is asymptotically tight, and that analogous results hold for two other tournament solutions based on the set of Condorcet winners and losers as well. Moreover, we prove in Section 4 that the uncovered set is likely to include the whole set of alternatives when p ∈ ω(√(log n/n)). This bound is again asymptotically tight, and the same holds for another tournament solution based on the uncovered set. The conditions that the top cycle or the uncovered set chooses all alternatives have meaningful graph-theoretic interpretations: the top cycle is the whole set of alternatives if and only if the tournament is strongly connected, 4 and the uncovered set fails to exclude any alternative exactly when all alternatives are kings. 5 We therefore believe that our results are of independent interest in graph theory and discrete mathematics.
Furthermore, the generality of our model allows us to derive consequences in Section 5 for a different model in which tournaments are generated from random voter preferences, and we complement our theoretical results with experimental data in Section 6.
4 A strongly connected tournament is also said to be strong. Strong connectedness is equivalent to irreducibility and to the property of having a Hamiltonian cycle (Moon, 1968). 5 A king is an alternative that can reach any other alternative via a directed path of length at most two (Maurer, 1980). Therefore, all alternatives of a tournament are kings if and only if every pair of alternatives can reach each other via a directed path of length at most two. Such a tournament has been studied in graph theory and called an all-kings tournament (Reid, 1982).

Related Work
The study of the behavior of tournament solutions in large random tournaments goes back to Moon and Moser (1962), who showed that the top cycle almost never rules out any alternative in a large tournament chosen uniformly at random. In fact, they proved a stronger statement that the probability that the top cycle excludes at least one alternative is inverse exponential in the number of alternatives; the estimate was later made more precise by Moon (1968) in his seminal book on tournaments. Bell (1981) also considered the top cycle but assumed that tournaments are generated from the preferences of a large number of voters, each with a uniform random ranking over the alternatives; he likewise found that the top cycle selects all alternatives with high probability under this assumption. Fey (2008) and later Scott and Fey (2012) established results on several tournament solutions including the uncovered set, the Banks set, the Copeland set, the minimal covering set, and the bipartisan set using the uniform random model. While the uncovered set, the Banks set, and the minimal covering set are likely to include all alternatives in a large random tournament, the same event is unlikely to occur for the Copeland set. On the other hand, the bipartisan set chooses on average half of the alternatives in a random tournament of any fixed size (Fisher and Ryan, 1995); it is the unique most discriminating tournament solution satisfying standard properties proposed in the literature (Brandt et al., 2018). The discriminative power of tournament solutions has also been investigated empirically by Brandt and Seedig (2016). Building on the observation that the distributions of real-world tournaments are typically far from uniform, these authors examined the behavior of eleven common tournament solutions on tournaments generated according to stochastic preference models and empirical data.
The stochastic models that they used include the impartial culture model, the Mallows mixtures model, and the Pólya-Eggenberger urn model. They reported that under these more realistic models, most tournament solutions are in fact much more discriminating than the analytical results for uniform random tournaments suggest.

Preliminaries
A tournament T consists of a set A = {a 1 , a 2 , . . . , a n } of alternatives and a dominance relation. The dominance relation is an asymmetric and connex binary relation on A represented by a directed edge between each unordered pair of distinct alternatives in A. We say that alternative a i dominates another alternative a j if there is an edge from a i to a j . An alternative is said to be a Condorcet winner if it dominates all of the remaining alternatives, and a Condorcet loser if it is dominated by all of the remaining alternatives. We extend the dominance relation to sets and say that a set A ′ ⊆ A of alternatives dominates another set A ′′ ⊆ A of alternatives disjoint from A ′ if for all a ′ ∈ A ′ and a ′′ ∈ A ′′ , a ′ dominates a ′′ . A tournament is commonly interpreted as the outcome of a round-robin sports competition and as the majority relation of an odd number of voters with linear preferences. In the former interpretation, alternative a i dominating alternative a j means that the player or team represented by a i beats the player or team represented by a j in the competition. In the latter interpretation, the same dominance relation signifies that more than half of the voters prefer a i to a j .
We are interested in tournament solutions, which are functions that map each tournament to a nonempty subset of its alternatives, usually referred to as the choice set. Two simple tournament solutions are COND, which chooses a Condorcet winner if one exists and chooses all alternatives otherwise, 6 and the set of Condorcet non-losers (CNL), which consists of all alternatives that are not Condorcet losers. Other tournament solutions considered in this paper are the following:
• The top cycle (TC) is the (unique) smallest set of alternatives such that all alternatives in the set dominate all alternatives not in the set;
• The uncovered set (UC) consists of all alternatives that can reach all other alternatives via a domination path of length at most two; 7
• The iterated uncovered set (UC∞) is the result of iteratively computing the uncovered set until there is no further reduction.
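The definitions above can be sketched in code. The following is a minimal, illustrative Python sketch rather than an implementation taken from the paper: the dominance-matrix representation and all function names (`from_edges`, `top_cycle`, and so on) are assumptions made for this example. For the top cycle it relies on the standard characterization that, in a tournament, the top cycle is exactly the set of alternatives that can reach every alternative via a directed path. The sketch assumes at least two alternatives.

```python
def from_edges(n, edges):
    """Build a dominance matrix: dom[i][j] is True iff i dominates j."""
    dom = [[False] * n for _ in range(n)]
    for i, j in edges:
        dom[i][j] = True
    return dom


def cond(dom):
    """COND: the Condorcet winner if one exists, otherwise all alternatives."""
    n = len(dom)
    winners = {i for i in range(n) if all(dom[i][j] for j in range(n) if j != i)}
    return winners if winners else set(range(n))


def cnl(dom):
    """CNL: every alternative that dominates at least one other alternative."""
    n = len(dom)
    return {i for i in range(n) if any(dom[i][j] for j in range(n) if j != i)}


def top_cycle(dom):
    """TC: alternatives that reach every alternative via a directed path."""
    n = len(dom)

    def reachable(start):
        seen, stack = {start}, [start]
        while stack:
            u = stack.pop()
            for v in range(n):
                if dom[u][v] and v not in seen:
                    seen.add(v)
                    stack.append(v)
        return seen

    return {i for i in range(n) if len(reachable(i)) == n}


def uncovered_set(dom, alts):
    """UC restricted to alts: alternatives reaching all others in at most two steps."""
    alts = set(alts)

    def reaches(i, j):
        return dom[i][j] or any(dom[i][l] and dom[l][j] for l in alts)

    return {i for i in alts if all(reaches(i, j) for j in alts if j != i)}


def iterated_uncovered_set(dom):
    """UC-infinity: apply UC to the surviving alternatives until nothing changes."""
    current = set(range(len(dom)))
    while True:
        nxt = uncovered_set(dom, current)
        if nxt == current:
            return current
        current = nxt
```

As a usage example, consider a 3-cycle on alternatives 0, 1, 2 together with an alternative 3 that loses to all of them: `top_cycle`, `uncovered_set`, `iterated_uncovered_set`, and `cnl` all return `{0, 1, 2}`, while `cond` returns all four alternatives because no Condorcet winner exists.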
The inclusions UC ∞ (T ) ⊆ UC (T ) ⊆ TC (T ) ⊆ CNL(T ) and TC (T ) ⊆ COND(T ) hold for any tournament T . Next, we describe the random models for generating tournaments that we consider in this paper. We will work with the first model in Sections 3 and 4 and the second model in Section 5.
• Model 1: For each pair of distinct alternatives a i , a j , there is an edge from a i to a j with probability p i,j and an edge from a j to a i with probability p j,i = 1 − p i,j , independently of other pairs of alternatives.
• Model 2: There is a constant number k of voters, where k is odd. For each voter v and each pair of distinct alternatives a i , a j , the voter prefers a i to a j with probability q v,i,j and prefers a j to a i with probability q v,j,i = 1 − q v,i,j , independently of other voters and other pairs of alternatives. 8 The majority relation, in which alternative a i dominates another alternative a j if and only if more than half of the voters prefer a i to a j , forms a tournament with A as its set of alternatives.
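To make the two models concrete, here is a small sampling sketch in Python. It is only an illustration under stated assumptions: the function names `sample_model1` and `sample_model2`, the nested-list layout of the probabilities, and the use of `random.Random` are choices made for this example and do not come from the paper. Only the entries with i < j are read, since the reverse probabilities are determined by p_{j,i} = 1 − p_{i,j} and q_{v,j,i} = 1 − q_{v,i,j}.

```python
import random


def sample_model1(p, rng):
    """Model 1: edge i -> j with probability p[i][j], independently per pair (i < j)."""
    n = len(p)
    dom = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p[i][j]:
                dom[i][j] = True
            else:
                dom[j][i] = True
    return dom


def sample_model2(q, rng):
    """Model 2: q[v][i][j] is the probability that voter v prefers i to j (i < j).

    The number of voters k = len(q) must be odd; i dominates j iff more than
    half of the voters prefer i to j.
    """
    k = len(q)
    if k % 2 == 0:
        raise ValueError("Model 2 requires an odd number of voters")
    n = len(q[0])
    dom = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            votes_for_i = sum(rng.random() < q[v][i][j] for v in range(k))
            if votes_for_i > k // 2:
                dom[i][j] = True
            else:
                dom[j][i] = True
    return dom
```

For instance, `sample_model1([[0.5] * n for _ in range(n)], random.Random(0))` draws from the uniform random model, and filling the upper triangle with a single value corresponds to the Condorcet random model.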
6 Note that the set of Condorcet winners is not a tournament solution because it can be empty. 7 This is known in graph theory as the set of kings (cf. Footnote 5). An alternative definition, which is also the origin of the name "uncovered set", is based on the covering relation. An alternative ai is said to cover another alternative aj if (i) ai dominates aj, and (ii) any alternative that dominates ai also dominates aj. The uncovered set corresponds to the set of alternatives that are not covered by any other alternative. 8 One way to interpret the possible intransitivity of the preferences is as a result of noise in the voters' true preferences. Laslier (2010) introduced the term Rousseauist cultures for this kind of models.
Several models for generating random tournaments considered in previous work are special cases of our models. For example, the uniform random model (Fey, 2008;Scott and Fey, 2012) corresponds to taking p i,j = 1/2 for all i, j in Model 1 or taking q v,i,j = 1/2 for all v, i, j in Model 2 with any k. The Condorcet random model (Frank, 1968;Łuczak et al., 1996;Vassilevska Williams, 2010;Kim et al., 2017) corresponds to taking p i,j = p for all i < j in Model 1, for some fixed value of p. The Condorcet random model for voters (Brandt and Seedig, 2016) corresponds to taking q v,i,j = p for all v and all i < j in Model 2, for some fixed value of p. Following standard terminology, we say that an event occurs "with high probability" or "almost surely" if the probability that the event occurs converges to 1 as n, the number of alternatives, goes to infinity. We end this section by listing some standard tools for deriving probabilistic bounds. Our first lemma is the Chernoff bound, which gives us an upper bound on the probability that a sum of independent random variables is far away from its expected value.
Lemma 1 (Chernoff bound). Let X 1 , X 2 , . . . , X r be independent random variables that take on values in the interval [0, 1], and let S = X 1 + X 2 + · · · + X r . For every δ ≥ 0, we have The next two lemmas allow us to estimate the expression 1 + x from above and below.
Lemma 2 (Bernoulli's inequality). For all real numbers r ≥ 1 and x ≥ −1, we have $(1 + x)^r \ge 1 + rx$.
Lemma 3. For all real numbers x, we have $1 + x \le e^x$.

Top Cycle
In this section, we consider the top cycle. We show that when each probability p i,j is between f (n) and 1 − f (n) for some function f (n) ∈ ω (1/n), TC chooses all alternatives with high probability (Theorem 1). By using the inclusion relationships between TC , COND, and CNL, we obtain analogous statements for COND and CNL. We also show that our results are asymptotically tight-for all three tournament solutions, the statement ceases to hold if f (n) ∈ O (1/n) (Theorem 2). We begin with our main result of the section.
Theorem 1. Let $f : \mathbb{Z}^+ \to \mathbb{R}_{\ge 0}$ be a function such that f(n) ≤ 1/2 for all n and f(n) ∈ ω(1/n). Assume that a tournament T is generated according to Model 1, and that $p_{i,j} \in [f(n), 1 - f(n)]$ for all i ≠ j. Then with high probability, TC(T) = A.
Theorem 1 generalizes a result by Łuczak et al. (1996) that establishes the claim for the case where p i,j = f (n) for all i < j (or, by symmetry, the case where p i,j = 1 − f (n) for all i < j). We remark that their proof relies crucially on the assumption that there is a linear order of alternatives and all edges are more likely to be oriented in one direction than in the other direction according to the order. Indeed, this assumption allows the authors to show that with high probability, any alternative can be reached by the strongest alternative and can reach the weakest alternative via a domination path of length at most two each. Moreover, with the assumption f (n) ∈ ω (1/n) one can show that the weakest alternative can almost surely reach the strongest alternative via a domination path of length four, thus establishing the strong connectivity of the tournament. In contrast, we do not assume that the edges in the tournament are likely to be oriented in one direction or the other. As such, we will need a completely different approach for our proof.
Before we go into the proof of Theorem 1, we first give a high-level overview. We observe that TC(T) ≠ A exactly when there exists a proper, nontrivial subset of alternatives B that dominates the complement set of alternatives A\B. Using the union bound, we then upper bound the probability that TC(T) ≠ A by the sum over all sets B of the probabilities that B dominates A\B. This sum can be written entirely in terms of the variables p_{i,j} for i < j and is moreover linear in each of these variables, implying that its maximum is attained when all variables take on a value at one of the two boundaries of their domain. Using a number of helper lemmas (Lemmas 4, 5, and 6), we show that the sum is in fact maximized when all variables take on a value at the same boundary. This allows us to bound the sum directly by plugging in the value at a boundary and complete the proof.
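The union-bound step in this overview can be checked numerically on very small instances. The sketch below is an illustration only, not the paper's argument: it assumes Model 1 with a full matrix `p` where `p[i][j]` is the probability that alternative i dominates alternative j, and the helper names are hypothetical. It computes the sum over all proper nonempty sets B of the probability that B dominates its complement, and compares it with the exact probability that the tournament is not strongly connected (equivalently, that TC(T) ≠ A), obtained by enumerating every orientation.

```python
from itertools import combinations, product


def union_bound(p):
    """Sum over proper nonempty B of Pr[B dominates its complement]."""
    n = len(p)
    total = 0.0
    for k in range(1, n):
        for B in combinations(range(n), k):
            members = set(B)
            prob = 1.0
            for i in B:
                for j in range(n):
                    if j not in members:
                        prob *= p[i][j]
            total += prob
    return total


def is_strongly_connected(dom):
    """Check that vertex 0 reaches, and is reached by, every vertex."""
    n = len(dom)

    def count_reachable(edge):
        seen, stack = {0}, [0]
        while stack:
            u = stack.pop()
            for v in range(n):
                if edge(u, v) and v not in seen:
                    seen.add(v)
                    stack.append(v)
        return len(seen)

    forward = count_reachable(lambda u, v: dom[u][v])
    backward = count_reachable(lambda u, v: dom[v][u])
    return forward == n and backward == n


def exact_failure_probability(p):
    """Exact Pr[tournament is not strongly connected], by brute force (tiny n only)."""
    n = len(p)
    pairs = list(combinations(range(n), 2))
    total = 0.0
    for orientation in product([True, False], repeat=len(pairs)):
        dom = [[False] * n for _ in range(n)]
        prob = 1.0
        for (i, j), i_wins in zip(pairs, orientation):
            if i_wins:
                dom[i][j] = True
                prob *= p[i][j]
            else:
                dom[j][i] = True
                prob *= p[j][i]
        if not is_strongly_connected(dom):
            total += prob
    return total
```

The brute-force enumeration visits 2^(n(n−1)/2) tournaments, so it is only feasible for a handful of alternatives; its purpose is to illustrate that the union bound dominates the exact failure probability, not to estimate it at scale.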
In what follows, we assume that x = (x 1 , x 2 , . . . , x n ) and y = (y 1 , y 2 , . . . , y n ) are vectors of nonnegative integers with n components. We start by defining majorization, a preorder on vectors that we will use frequently in our proof.
Definition 1 (Majorization). For a vector x, let $x^{\downarrow} = (x^{\downarrow}_1, x^{\downarrow}_2, \dots, x^{\downarrow}_n)$ be the vector with the same components, but sorted in descending order. Given two vectors x, y, we say that x majorizes y, and write x ≻ y, if the following two conditions are satisfied:
• $\sum_{i=1}^{k} x^{\downarrow}_i \ge \sum_{i=1}^{k} y^{\downarrow}_i$ for all k = 1, 2, …, n;
• $\sum_{i=1}^{n} x_i = \sum_{i=1}^{n} y_i$.
When one vector majorizes another vector, Karamata's inequality allows us to compare the sum of an arbitrary convex function at the components of one vector to the corresponding sum of the other vector.
Lemma 4 (Karamata's inequality). Let $f : \mathbb{Z}_{\ge 0} \to \mathbb{R}$ be a convex function, and let x, y be vectors with n components such that x ≻ y. Then $\sum_{i=1}^{n} f(x_i) \ge \sum_{i=1}^{n} f(y_i)$.
We next show that if one vector majorizes another vector, then an analogous statement holds for the two vectors that arise from taking the sum of all subsets with any fixed number of components of the original vectors.
Definition 2. Let n be a positive integer and k ∈ {1, 2, …, n}. For a vector x with n components, define $x^{(k)}$ to be the vector with $\binom{n}{k}$ components consisting of all sums of k distinct components of x in nonincreasing order.
Lemma 5. If two vectors x, y with n components are such that x ≻ y, then we also have $x^{(k)} \succ y^{(k)}$ for all k = 1, 2, …, n.
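Majorization and the k-subset-sum vectors of Definition 2 are easy to experiment with. The following Python sketch is illustrative only; it assumes the standard definition of majorization (every prefix sum of the descending-sorted x is at least the corresponding prefix sum of y, and the totals are equal), and the names `majorizes` and `k_sums` are made up for this example. It does not prove Lemma 5; it merely lets one check the statement on concrete integer vectors.

```python
from itertools import combinations


def majorizes(x, y):
    """Return True iff x majorizes y (integer vectors of equal length)."""
    xs = sorted(x, reverse=True)
    ys = sorted(y, reverse=True)
    if len(xs) != len(ys) or sum(xs) != sum(ys):
        return False
    prefix_x = prefix_y = 0
    for a, b in zip(xs, ys):
        prefix_x += a
        prefix_y += b
        if prefix_x < prefix_y:
            return False
    return True


def k_sums(x, k):
    """All sums of k distinct components of x, in nonincreasing order."""
    return sorted((sum(c) for c in combinations(x, k)), reverse=True)
```

For example, (3, 2, 1, 0) is the outdegree vector of a transitive tournament on four alternatives and (2, 2, 1, 1) is the outdegree vector of another tournament on four alternatives; `majorizes((3, 2, 1, 0), (2, 2, 1, 1))` holds, and so does the same relation between `k_sums(..., k)` of the two vectors for each k from 1 to 4.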
For the sake of continuity, we leave the proof of Lemma 5 along with that of the next lemma to the appendix.
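Our final lemma shows that the outdegree vector of a transitive tournament majorizes the corresponding vector of any tournament. Given a tournament T and alternative a in the tournament, denote by $\deg_T(a)$ the outdegree of a in T.
Lemma 6. Let W be a transitive tournament on n alternatives $d_1, d_2, \dots, d_n$, and let U be any tournament on n alternatives $b_1, b_2, \dots, b_n$. Then $(\deg_W(d_1), \dots, \deg_W(d_n)) \succ (\deg_U(b_1), \dots, \deg_U(b_n))$.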
With Lemmas 4, 5, and 6 in hand, we are now ready to prove Theorem 1.
Proof of Theorem 1. Let c ≥ 10 be a constant. Since f(n) ∈ ω(1/n), there exists N′ such that f(n) ≥ c/n for all n ≥ N′. Let N = max(N′, 4c). We will show that for n ≥ N, the probability that TC does not choose the whole set of alternatives is at most $16c\,e^{-c/2}$. Since this expression converges to 0 as c approaches infinity, this will establish the desired result.
Assume that n ≥ N. Observe that TC(T) ≠ A exactly when there is a proper, nontrivial set of alternatives that dominates the complement set of alternatives. Hence
$$\Pr[TC(T) \ne A] \le \sum_{k=1}^{n-1} \sum_{B \subseteq A,\ |B| = k} \ \prod_{a_i \in B,\ a_j \in A \setminus B} p_{i,j}, \qquad (1)$$
where we use the union bound for the inequality. We will derive an upper bound for expression (1). Note that if we view the terms p_{i,j} with i < j as variables (writing p_{j,i} = 1 − p_{i,j}), then the expression is linear in each variable. This implies that the maximum of expression (1) over the range p_{i,j} ∈ [c/n, 1 − c/n] is attained when each p_{i,j} is either c/n or 1 − c/n (but not necessarily when all p_{i,j} are identical). We henceforth assume that for each i < j, either p_{i,j} = c/n or p_{i,j} = 1 − c/n. We will show that expression (1) is maximized when p_{i,j} = 1 − c/n for all i < j (or alternatively, when p_{i,j} = c/n for all i < j). In fact, we will show the stronger statement that for each particular value of k in the outermost summation, the expression inside the outermost summation is also maximized when p_{i,j} = 1 − c/n for all i < j.
Fix k ∈ {1, 2, …, n − 1}. Define a tournament U on n alternatives $b_1, b_2, \dots, b_n$ as follows: for all i < j, $b_i$ dominates $b_j$ if p_{i,j} = 1 − c/n, and $b_j$ dominates $b_i$ if p_{i,j} = c/n. For a set B of k alternatives, the number of edges of U directed from B to its complement is $\sum_{b_i \in B} \deg_U(b_i) - \binom{k}{2}$, and each such edge replaces a factor c/n by a factor 1 − c/n. The expression inside the outermost summation can therefore be written as
$$\left(\frac{c}{n}\right)^{k(n-k)} \left(\frac{n-c}{c}\right)^{-\binom{k}{2}} \sum_{B,\ |B| = k} \left(\frac{n-c}{c}\right)^{\sum_{b_i \in B} \deg_U(b_i)}. \qquad (2)$$
Let W be a transitive tournament on n alternatives $d_1, d_2, \dots, d_n$, where $d_i$ dominates $d_j$ for all i < j. In particular, $\deg_W(d_i) = n - i$ for all i = 1, 2, …, n. To show that expression (1) is maximized when p_{i,j} = 1 − c/n for all i < j, it suffices to show that expression (2) is maximized when U = W. The terms outside the summation do not depend on the tournament U that we choose, so for the purpose of maximizing expression (2) we may ignore them.
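From Lemma 6, we know that $(\deg_W(d_1), \dots, \deg_W(d_n)) \succ (\deg_U(b_1), \dots, \deg_U(b_n))$. Lemma 5 then implies that $(\deg_W(d_1), \dots, \deg_W(d_n))^{(k)} \succ (\deg_U(b_1), \dots, \deg_U(b_n))^{(k)}$. Using Lemma 4 with the convex function $f(x) = \left(\frac{n-c}{c}\right)^{x}$, we find that
$$\sum_{B,\ |B| = k} \left(\frac{n-c}{c}\right)^{\sum_{d_i \in B} \deg_W(d_i)} \ \ge \ \sum_{B,\ |B| = k} \left(\frac{n-c}{c}\right)^{\sum_{b_i \in B} \deg_U(b_i)}.$$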
It follows that expression (2) is maximized when U = W, as claimed. We return to expression (1), which we now know is maximized when p_{i,j} = 1 − c/n for all i < j. Substituting p_{i,j} = 1 − c/n for all i < j, expression (1) becomes
$$\sum_{k=1}^{n-1} \sum_{B \subseteq A,\ |B| = k} \left(1 - \frac{c}{n}\right)^{k(n-k)} \left(\frac{c}{n-c}\right)^{\sum_{i : a_i \in B} i \, - \binom{k+1}{2}},$$
which is then bounded using Lemma 3, the assumption n ≥ 4c, and the symmetry between the terms with k = i and k = n − i. Observe that $\sum_{i : a_i \in B} i - \binom{k+1}{2}$ is always nonnegative, and is zero exactly when $B = \{a_1, a_2, \dots, a_k\}$. Moreover, for any j ∈ {1, 2, …, k}, the number of subsets B ⊂ A with |B| = k such that $\sum_{i : a_i \in B} i - \binom{k+1}{2} \le j$ is at most $n^j$. Indeed, if a subset B satisfies this inequality, the k − j smallest indices of alternatives in B must be 1, 2, …, k − j, which leaves at most $n^j$ choices for the remaining elements. Note also that $|\{B \subset A \mid |B| = k\}| = \binom{n}{k} \le n^k$. Combining these estimates with the assumption c ≥ 10 bounds expression (1) from above by $16c\,e^{-c/2}$.
In conclusion, when n ≥ N, the probability that TC(T) ≠ A is at most $16c\,e^{-c/2}$, completing our proof.
Corollary 1. Let $f : \mathbb{Z}^+ \to \mathbb{R}_{\ge 0}$ be a function such that f(n) ≤ 1/2 for all n and f(n) ∈ ω(1/n). Assume that a tournament T is generated according to Model 1, and that $p_{i,j} \in [f(n), 1 - f(n)]$ for all i ≠ j. Then with high probability, COND(T) = CNL(T) = A.
Next, we show that Theorem 1 and Corollary 1 are tight in the sense that if f (n) ∈ O (1/n), the results no longer hold.
Theorem 2. Let c ≥ 0 be a constant. Assume that a tournament T is generated according to Model 1, and that $p_{i,j} \le c/n$ for all i > j. Then for large enough n, with at least constant probability both TC(T) and COND(T) contain a single alternative. Moreover, for large enough n, with at least constant probability CNL(T) does not contain all alternatives.
Proof. The probability that $a_1$ dominates all of the remaining alternatives is at least $(1 - c/n)^{n-1}$, which converges to $e^{-c}$ as n → ∞. When this occurs, both TC and COND only choose $a_1$. An analogous argument shows that $a_n$ is dominated by all of the remaining alternatives with at least constant probability for large enough n. When this occurs, CNL chooses all alternatives except $a_n$.
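The constant-probability claim in this proof rests on a simple limit, which the following Python sketch evaluates numerically. It is a hedged illustration: it assumes, as in Theorem 2, that every upset probability p_{i,j} with i > j is at most c/n, so that a_1 beats each of the other n − 1 alternatives independently with probability at least 1 − c/n. The function name is hypothetical.

```python
import math


def condorcet_winner_lower_bound(n, c):
    """Lower bound (1 - c/n)^(n-1) on Pr[a_1 dominates all other alternatives]."""
    return (1.0 - c / n) ** (n - 1)


def limit_value(c):
    """The limit e^(-c) of the lower bound as n grows."""
    return math.exp(-c)
```

Evaluating `condorcet_winner_lower_bound(n, 2.0)` for growing n shows the values approaching `limit_value(2.0)`, which is a positive constant; this is why the top cycle collapses to a single alternative with at least constant probability in this regime.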
Theorems 1 and 2 and Corollary 1 allow us to obtain the following corollary on the Condorcet random model.
Corollary 2. Let f : Z + → R ≥0 be a function such that f (n) ≤ 1/2 for all n. Assume that a tournament T is generated according to Model 1, and that p i,j = f (n) for all i > j.
• If f (n) ∈ o (1/n), then with high probability, TC (T ) and COND(T ) contain a single alternative, and CNL(T ) does not contain all alternatives.
• If f(n) ≤ c/n for some constant c ≥ 0, then for large enough n, with at least constant probability TC(T) and COND(T) contain a single alternative. Moreover, for large enough n, with at least constant probability CNL(T) does not contain all alternatives.
Łuczak et al. (1996) also considered the case where $p_{i,j} = c/n$ for all i > j and showed that the probability that TC selects all alternatives converges to $(1 - e^{-c})^2$ in this special case. Our next theorem establishes an analogous result for COND and CNL.
Theorem 3. Let c ≥ 0 be a constant. Assume that a tournament T is generated according to Model 1, and that $p_{i,j} = c/n$ for all i > j. Then the probability that COND(T) = A converges to $1 - e^{-c}$ as n → ∞. The same statement holds for CNL.
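Proof. We show the result for COND; a similar argument holds for CNL. Since COND(T) ≠ A exactly when a Condorcet winner exists, and at most one alternative can be a Condorcet winner, we have
$$\Pr[COND(T) \ne A] = \sum_{i=1}^{n} \left(\frac{c}{n}\right)^{i-1} \left(1 - \frac{c}{n}\right)^{n-i} = \left(1 - \frac{c}{n}\right)^{n-1} \cdot \sum_{i=0}^{n-1} \left(\frac{c}{n-c}\right)^{i}.$$
The first term converges to $e^{-c}$ as n → ∞. For the second term, notice that it is always at least 1. Moreover, when n ≥ (k + 1)c for some k > 1, we have c/(n − c) ≤ 1/k, so the term is at most $\sum_{i=0}^{\infty} k^{-i} = \frac{k}{k-1}$, and k can be taken arbitrarily large as n grows. Hence the second term converges to 1, and therefore the probability that COND(T) ≠ A converges to $e^{-c}$, yielding the desired result.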
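The limit in Theorem 3 can be sanity-checked numerically. The sketch below is an illustration under the theorem's assumption p_{i,j} = c/n for all i > j: alternative a_i is then a Condorcet winner exactly when it wins i − 1 upsets (each with probability c/n) and beats the n − i alternatives ranked below it (each with probability 1 − c/n), and these events are disjoint across i. The function name is an assumption made for this example.

```python
import math


def prob_condorcet_winner(n, c):
    """Pr[some alternative is a Condorcet winner] when p_{i,j} = c/n for i > j."""
    p = c / n
    return sum(p ** (i - 1) * (1.0 - p) ** (n - i) for i in range(1, n + 1))


def prob_cond_selects_all(n, c):
    """Pr[COND(T) = A], i.e. the probability that no Condorcet winner exists."""
    return 1.0 - prob_condorcet_winner(n, c)
```

For c = 1 and n = 2000, `prob_cond_selects_all` is already within about 10^-3 of 1 − e^-1 ≈ 0.632.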

Uncovered Set
In this section, we turn our focus to the uncovered set. We show that when each probability p_{i,j} is between f(n) and 1 − f(n) for some function $f(n) \ge c\sqrt{\log n/n}$ with $c > \sqrt{2}$ a constant, UC chooses all alternatives with high probability (Theorem 4). As with TC, we also show that our result is asymptotically tight: if $f(n) \le 0.6\sqrt{\log n/n}$, the statement no longer holds (Theorem 5). It follows that similar results hold for UC∞, implying that $\Theta(\sqrt{\log n/n})$ is the threshold where the two tournament solutions go from almost always choosing all alternatives to excluding at least one alternative with high probability.
Our first result of the section shows that UC chooses the whole set of alternatives for a wide range of distributions over tournaments.
Theorem 4. Let $c > \sqrt{2}$ be a constant. Assume that a tournament T is generated according to Model 1, and that $p_{i,j} \in \left[c\sqrt{\log n/n},\ 1 - c\sqrt{\log n/n}\right]$ for all i ≠ j. Then with high probability, UC(T) = A.
Proof. Choose N such that $c^2 (N-2)/N > 2$, and let n ≥ N. Fix a pair of distinct alternatives $a_i, a_j$. We first bound the probability that $a_i$ cannot reach $a_j$ via a domination path of length at most two. For each l ∉ {i, j}, the probability that there is an edge from $a_i$ to $a_l$ and an edge from $a_l$ to $a_j$ is at least $\left(c\sqrt{\log n/n}\right)^2 = c^2 \log n/n$. These events involve disjoint pairs of alternatives for different l and are therefore independent. The probability that $a_i$ cannot reach $a_j$ via a domination path of length at most two is therefore bounded above by
$$\left(1 - \frac{c^2 \log n}{n}\right)^{n-2} \le e^{-c^2 (n-2) \log n / n} = n^{-c^2 (n-2)/n},$$
where we use Lemma 3 for the inequality.
Observe that UC(T) = A exactly when any alternative can reach any other alternative via a domination path of length at most two. Using the union bound over all (ordered) pairs of distinct alternatives i, j, we find that the probability that some alternative cannot reach some other alternative via a domination path of length at most two is no more than $n(n-1)\, n^{-c^2 (n-2)/n} \le n^{2 - c^2 (n-2)/n}$, which vanishes for large n.
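To get a feel for how quickly this union bound decays, the following Python sketch evaluates it numerically. It is an illustration under the assumption that every p_{i,j} lies in [c·sqrt(log n/n), 1 − c·sqrt(log n/n)], so that each intermediate alternative gives a two-step path with probability at least c² log n/n; the function names are hypothetical.

```python
import math


def uc_union_bound(n, c):
    """n(n-1) * (1 - c^2 log n / n)^(n-2): bound on Pr[UC(T) != A]."""
    per_path = c * c * math.log(n) / n
    if per_path > 1.0:
        raise ValueError("n is too small for this choice of c")
    return n * (n - 1) * (1.0 - per_path) ** (n - 2)


def uc_closed_form_bound(n, c):
    """The weaker closed form n^(2 - c^2 (n-2)/n)."""
    return n ** (2.0 - c * c * (n - 2) / n)
```

With c = 1.5 (so c² = 2.25 > 2), the bound is roughly 0.16 at n = 1000 and keeps shrinking as n grows, but only polynomially fast; the decay rate depends on how far c² exceeds 2.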
Since the uncovered set is the finest tournament solution satisfying the axioms of Condorcet consistency, neutrality, and expansion (Moulin, 1986), Theorem 4 implies that any tournament solution that satisfies these three axioms also selects all alternatives with high probability when the tournament is generated according to the assumptions of the theorem.
Next, we show that the statement of Theorem 4 breaks down if $f(n) \le 0.6\sqrt{\log n/n}$, thus confirming that the assumption of the theorem cannot be relaxed asymptotically.
Theorem 5. Assume that a tournament T is generated according to Model 1, and that $p_{i,j} \le 0.6\sqrt{\log n/n}$ for all i > j. Then with high probability, UC(T) ≠ A.
Proof. Let $A_1 = \{a_1, a_2, \dots, a_{\lfloor n^{0.49} \rfloor}\}$, and let $A_2$ be the set of alternatives that $a_n$ dominates. We first prove the following claim.
Claim: With high probability, the following two events occur simultaneously: (i) $a_n$ does not dominate any of the alternatives in $A_1$, and (ii) $|A_2| \le 0.61\sqrt{n \log n}$.
Proof of Claim: First, using Lemma 2, the probability that $a_n$ dominates at least one of the alternatives in $A_1$ is at most
$$1 - \left(1 - 0.6\sqrt{\frac{\log n}{n}}\right)^{\lfloor n^{0.49} \rfloor} \le 1 - \left(1 - 0.6\, n^{0.49} \sqrt{\frac{\log n}{n}}\right) \le 0.6 \cdot \frac{\sqrt{\log n}}{n^{0.01}},$$
which converges to 0 as n → ∞. Next, for each $a_i \in A$ with i = 1, 2, …, n − 1, let $X_i$ be an indicator random variable that indicates whether $a_n$ dominates $a_i$ or not: $X_i$ takes on the value 1 if $a_n$ dominates $a_i$ and 0 otherwise. We have and $\mathbb{E}[X'] = 0.6\, n \sqrt{\log n/n}$.
Moreover, observe that $|A_2| = \sum_{i=1}^{n-1} X_i$. By Lemma 1, it follows that which again vanishes for large n.
Using the union bound over the two events, we have our claim. From now on, we assume that $a_n$ does not dominate any of the alternatives in $A_1$ and that $|A_2| \le 0.61\sqrt{n \log n}$. Under this assumption, $a_n$ can reach all of the alternatives in $A_1$ via a domination path of length at most two if and only if each alternative in $A_1$ is dominated by some alternative in $A_2$. Note that the event that this holds for a particular alternative in $A_1$ is independent of the corresponding events for other alternatives in $A_1$.

It follows that
$$\Pr[a_n \text{ can reach all } a_i \in A_1 \text{ via a domination path of length at most two}] = \Pr[a_n \text{ can reach a fixed } a_i \in A_1 \text{ via a domination path of length two}]^{\lfloor n^{0.49} \rfloor}.$$
The right-hand side vanishes as n → ∞, so the probability that $a_n \notin UC(T)$ converges to 1 as n goes to infinity. This implies that with high probability, UC(T) is not the whole set of alternatives, as desired.
Since UC (T ) = A exactly when UC ∞ (T ) = A, we immediately have the following corollary.
Corollary 3. Assume that a tournament T is generated according to Model 1.
• Let $c > \sqrt{2}$ be a constant. If $p_{i,j} \in \left[c\sqrt{\log n/n},\ 1 - c\sqrt{\log n/n}\right]$ for all i ≠ j, then with high probability, UC∞(T) = A.
• If $p_{i,j} \le 0.6\sqrt{\log n/n}$ for all i > j, then with high probability, UC∞(T) ≠ A.
Theorems 4 and 5 and Corollary 3 allow us to obtain the following corollary on the Condorcet random model.
Corollary 4. Let f : Z + → R ≥0 be a function such that f (n) ≤ 1/2 for all n. Assume that a tournament T is generated according to Model 1, and that p i,j = f (n) for all i > j.
• If $f(n) \in \omega\left(\sqrt{\log n/n}\right)$ or $f(n) \ge c\sqrt{\log n/n}$ for some constant $c > \sqrt{2}$, then with high probability, UC(T) = UC∞(T) = A.
• If $f(n) \in o\left(\sqrt{\log n/n}\right)$ or $f(n) \le 0.6\sqrt{\log n/n}$, then with high probability, UC(T) ≠ A and UC∞(T) ≠ A.

Majority Tournaments
Thus far, we have established probabilistic results for a general model in which the distribution over tournaments is defined by the probabilities that an alternative dominates another alternative in the tournament (Model 1). As we mentioned in Section 2, a common interpretation of tournaments is as the majority relation of an odd number of voters who are endowed with linear preferences over a set of alternatives. In this section, we investigate a more specific model in which the distribution over tournaments is determined by the probability that a voter prefers an alternative to another alternative (Model 2). It turns out that the generality of our results for Model 1 will allow us to derive similar results for Model 2 as consequences.
We first consider the coarser tournament solutions TC, COND, and CNL.
Theorem 6. Let $f : \mathbb{Z}^+ \to \mathbb{R}_{\ge 0}$ be a function such that $f(n) \le 1/2$ for all $n$, and $f(n) \in \omega\left(1/n^{2/(k+1)}\right)$. Assume that a tournament $T$ is generated according to Model 2, and that $q_{v,i,j} \in [f(n), 1 - f(n)]$ for all voters $v$ and all $i \ne j$. Then with high probability, $TC(T) = COND(T) = CNL(T) = A$.
Proof. Since $TC(T) \subseteq COND(T)$ and $TC(T) \subseteq CNL(T)$, it suffices to prove the statement for TC. Let $c > 0$ be a constant. Since $f(n) \in \omega\left(1/n^{2/(k+1)}\right)$, there exists $N'$ such that $f(n) \ge c/n^{2/(k+1)}$ for all $n \ge N'$. Let $N = \max\left(N', (2c)^{(k+1)/2}\right)$ and $n \ge N$, and fix a pair of distinct alternatives $a_i, a_j$. Let $p_{i,j}$ denote the probability that $a_i$ dominates $a_j$ in $T$. Observe that $p_{i,j}$ is minimized when $q_{v,i,j} = c/n^{2/(k+1)}$ for all voters $v$. When $q_{v,i,j}$ takes on this value for all $v$, the probability that $a_i$ dominates $a_j$ is at least the probability that exactly $(k+1)/2$ voters prefer $a_i$ to $a_j$. Writing $q = c/n^{2/(k+1)}$, the latter probability is
$$\binom{k}{(k+1)/2}\, q^{(k+1)/2}\, (1-q)^{(k-1)/2},$$
which is bounded from below in two steps, where we use the assumption $n \ge (2c)^{(k+1)/2}$ (which ensures $q \le 1/2$) for the first inequality and the approximation $\binom{n}{k} \ge (n/k)^k$ for the second inequality. Hence $p_{i,j} \ge c/n$ for all $n \ge N$. This implies that there exists a function $f(n) \in \omega(1/n)$ such that $p_{i,j} \in [f(n), 1 - f(n)]$. Using Theorem 1, we have that $TC(T) = A$ with high probability, as desired.
We now consider the finer tournament solutions UC and $UC^\infty$.
Theorem 7. Let $c > \sqrt{2}$ be a constant. Assume that a tournament $T$ is generated according to Model 2, and that $q_{v,i,j} \in \left[c(\log n/n)^{1/(k+1)},\ 1 - c(\log n/n)^{1/(k+1)}\right]$ for all voters $v$ and all $i \ne j$. Then with high probability, $UC(T) = UC^\infty(T) = A$.
Proof. Since $UC(T) = A$ implies that $UC^\infty(T) = A$, it suffices to prove the statement for UC. Let $n \ge (2c)^{2k+2}$, and fix a pair of distinct alternatives $a_i, a_j$. Let $p_{i,j}$ denote the probability that $a_i$ dominates $a_j$ in $T$. Observe that $p_{i,j}$ is minimized when $q_{v,i,j} = c(\log n/n)^{1/(k+1)}$ for all voters $v$. When $q_{v,i,j}$ takes on this value for all $v$, the probability that $a_i$ dominates $a_j$ is at least the probability that exactly $(k+1)/2$ voters prefer $a_i$ to $a_j$. Writing $q = c(\log n/n)^{1/(k+1)}$, the latter probability is
$$\binom{k}{(k+1)/2}\, q^{(k+1)/2}\, (1-q)^{(k-1)/2},$$
which is bounded from below in two steps, where we use the assumption $n \ge (2c)^{2k+2}$ (which ensures $q \le 1/2$) for the first inequality and the approximation $\binom{n}{k} \ge (n/k)^k$ for the second inequality. Using Theorem 4, we have that $UC(T) = A$ with high probability, as desired.
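The proofs of Theorems 6 and 7 both lower-bound $p_{i,j}$ by the probability that exactly $(k+1)/2$ voters prefer $a_i$ to $a_j$. The following is a minimal numerical sketch of that comparison, assuming (as in the worst case considered in the proofs) that all $k$ voters decide independently and share the same probability $q$. The function names are hypothetical and are not taken from the paper.

```python
from math import comb


def majority_prob(q: float, k: int) -> float:
    """Probability that a_i dominates a_j in the majority tournament.

    Assumes k is odd and that each of the k voters independently
    prefers a_i to a_j with the same probability q (an illustrative
    simplification of Model 2).
    """
    if k <= 0 or k % 2 == 0:
        raise ValueError("k must be a positive odd integer")
    need = (k + 1) // 2  # smallest strict majority
    return sum(comb(k, t) * q**t * (1 - q) ** (k - t) for t in range(need, k + 1))


def exact_majority_prob(q: float, k: int) -> float:
    """Probability that exactly (k+1)/2 voters prefer a_i to a_j."""
    if k <= 0 or k % 2 == 0:
        raise ValueError("k must be a positive odd integer")
    need = (k + 1) // 2
    return comb(k, need) * q**need * (1 - q) ** (k - need)


if __name__ == "__main__":
    k = 5
    for q in (0.01, 0.1, 0.3, 0.5):
        # The second value never exceeds the first, as used in the proofs.
        print(q, majority_prob(q, k), exact_majority_prob(q, k))
```

For a single voter ($k = 1$) both quantities reduce to $q$, which matches the way Model 1 is recovered from Model 2.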

Experiments
To complement our theoretical results, in this section we investigate the asymptotic behavior of random tournaments according to the Condorcet random model as well as another more realistic model that we call the gap model.

Condorcet Random Model
Starting from a set of alternatives $\{a_1, a_2, \dots, a_n\}$, we generate random tournaments according to the Condorcet random model by inserting for each pair of alternatives $a_i, a_j$ with $i > j$ an edge from $a_i$ to $a_j$ with probability $p$ and an edge in the reverse direction with probability $1 - p$. The tournament solutions that we consider can all be computed efficiently: A simple counting algorithm suffices to compute COND, a depth-first search algorithm computes TC in linear time, and the asymptotic running time for computing UC equals that of matrix multiplication (Hudry, 2009). In our experimental setup, we draw 10000 random tournaments of each size $n \in \{5, 10, 20, 30, \dots, 100\}$ for each $p \in \{0.5,\ 0.3,\ 1/n,\ 1/n^2,\ \sqrt{2\log n/n},\ 0.6\sqrt{\log n/n}\}$ and check for each tournament solution $S \in \{COND, UC, TC\}$ whether it selects all alternatives. 9,10 From these draws, we compute the percentage of tournaments in which all alternatives are selected. The resulting graphs are displayed in Figure 1.
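The setup above can be sketched in a few lines. This is an illustration only and not the authors' code: the function names are hypothetical, and the checks use plain quadratic and cubic loops instead of the faster algorithms cited above. It relies on three standard characterizations: COND selects all alternatives exactly when there is no Condorcet winner, TC selects all alternatives exactly when the tournament is strongly connected, and UC selects all alternatives exactly when every alternative reaches every other one via a domination path of length at most two.

```python
import random


def condorcet_tournament(n, p, rng):
    """beats[i][j] is True iff a_{i+1} dominates a_{j+1}.

    For i > j the edge goes from index i to index j with probability p.
    """
    beats = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(i):
            if rng.random() < p:
                beats[i][j] = True
            else:
                beats[j][i] = True
    return beats


def cond_selects_all(beats):
    n = len(beats)
    # A Condorcet winner dominates all n - 1 other alternatives.
    return not any(sum(row) == n - 1 for row in beats)


def tc_selects_all(beats):
    n = len(beats)

    def reaches_all(forward):
        # Depth-first search from index 0 along (or against) the edges.
        seen, stack = {0}, [0]
        while stack:
            u = stack.pop()
            for v in range(n):
                edge = beats[u][v] if forward else beats[v][u]
                if edge and v not in seen:
                    seen.add(v)
                    stack.append(v)
        return len(seen) == n

    return reaches_all(True) and reaches_all(False)


def uc_selects_all(beats):
    n = len(beats)
    for i in range(n):
        for j in range(n):
            if i == j or beats[i][j]:
                continue
            if not any(beats[i][m] and beats[m][j] for m in range(n)):
                return False  # a_{j+1} covers a_{i+1}
    return True


def fraction_selecting_all(n, p, trials, seed=0):
    rng = random.Random(seed)
    counts = {"COND": 0, "TC": 0, "UC": 0}
    for _ in range(trials):
        t = condorcet_tournament(n, p, rng)
        counts["COND"] += cond_selects_all(t)
        counts["TC"] += tc_selects_all(t)
        counts["UC"] += uc_selects_all(t)
    return {name: c / trials for name, c in counts.items()}


if __name__ == "__main__":
    print(fraction_selecting_all(n=30, p=0.5, trials=200))
```

Because UC selecting everything implies that TC does, which in turn implies that COND does, the three reported fractions are always ordered accordingly.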
For p = 0.5, which corresponds to the uniform random model, our experimental results in Figure 1(a) coincide with the main theorem of Fey (2008). The results moreover reveal that UC chooses all alternatives with high probability in tournaments with at least 50 alternatives while COND and TC already do so in much smaller tournaments. As p decreases from 0.5 toward 0, the curves of COND, TC , and UC are shifted to the right; this is to be expected since for smaller p the tournament is more skewed, making it more likely for weaker alternatives to be excluded. Nevertheless, for any fixed p the fraction of tournaments in which all alternatives are chosen approaches 1. In particular, when p = 0.3, UC almost never rules out any alternative in tournaments of size 100 or more (Figure 1(b)).
Next, we look at the regimes where the probability $p$ goes to 0 as $n$ approaches infinity. For the case of $p = 1/n$ we find that, in line with Theorem 3, the probability that COND selects all alternatives converges to $1 - e^{-1} \approx 0.6321$ (Figure 1(c)). Similarly, the probability that TC selects all alternatives converges to $(1 - e^{-1})^2 \approx 0.3996$ for the same value of $p$, confirming a result by Łuczak et al. (1996). Letting $p$ approach 0 even faster, we find that for $p = 1/n^2$, both TC and COND are discriminative with high probability (Figure 1(d)). As $1/n^2 \in o(1/n)$, this is consistent with Corollary 2. Note that UC is discriminative for almost all tournaments for both $p = 1/n$ and $p = 1/n^2$; indeed, this is implied by Corollary 4 since already $1/n \in o\left(\sqrt{\log n/n}\right)$.
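The limit $1 - e^{-1}$ for COND can be checked numerically. The sketch below is not from the paper; it assumes independent edges and $n \ge 2$, and uses the observation that $a_i$ is a Condorcet winner exactly when it beats the $i - 1$ stronger alternatives (each with probability $p$) and the $n - i$ weaker ones (each with probability $1 - p$), with these events being disjoint across $i$. The function names are hypothetical.

```python
import math


def prob_condorcet_winner(n: int, p: float) -> float:
    """Probability that some alternative dominates all others.

    Condorcet random model with independent edges: the events
    "a_i is the Condorcet winner" are disjoint, so their
    probabilities simply add up.
    """
    return sum(p ** (i - 1) * (1 - p) ** (n - i) for i in range(1, n + 1))


def prob_cond_selects_all(n: int, p: float) -> float:
    """Probability that COND returns every alternative (n >= 2)."""
    return 1.0 - prob_condorcet_winner(n, p)


if __name__ == "__main__":
    for n in (10, 100, 1000, 10000):
        print(n, prob_cond_selects_all(n, 1 / n))
    print("limit for COND:", 1 - math.exp(-1))
    print("limit for TC:  ", (1 - math.exp(-1)) ** 2)
```

The printed values for $p = 1/n$ approach $1 - e^{-1}$ from above as $n$ grows; the TC constant is shown only for reference and is not derived here.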
Finally, we consider the regime $p = \Theta\left(\sqrt{\log n/n}\right)$, which according to Corollary 4 is the boundary between UC almost never ruling out any alternative and almost always ruling out at least one alternative. The experimental setting for $p = c\sqrt{\log n/n}$ with $c \in \{0.6, \sqrt{2}\}$ differs from the previous settings in that we only examined tournaments of size $n \ge 50$, since for small $n$ the expression $\sqrt{2\log n/n}$ is larger than 0.5, making it unsuitable for our experiments. On the other hand, as $p$ decreases rather slowly, we examined random tournaments up to size 1000 in order to increase the expressive power of our experiments. We find that COND and TC select all alternatives with high probability for both values of $c$; this is in line with Corollary 2 and the observation that $c\sqrt{\log n/n} \in \omega(1/n)$. On the other hand, our experiments indicate that UC returns all alternatives in almost all tournaments in the case of $p = \sqrt{2\log n/n}$ (Figure 1(e)) but is discriminative in almost all tournaments when $p = 0.6\sqrt{\log n/n}$ (Figure 1(f)). These findings are consistent with Corollary 4 and demonstrate the interesting fact that a small gap in the constant factor constitutes the threshold with regard to the discriminative power of UC.

Gap Model
As we explained in the introduction, while the Condorcet random model is commonly used in theoretical analyses, it does not properly capture tournaments in the real world since it assigns the same probability to all edges regardless of the difference in strength between the two alternatives adjacent to that edge. We next consider a different model, which we call the gap model, that takes this issue into account.
As in the Condorcet random model, in the gap model there is a linear order of alternatives from strongest to weakest as well as a parameter $p \le 0.5$. However, the probability that a stronger alternative dominates a weaker alternative depends linearly on the size of the gap between the two alternatives in the linear order: For two alternatives $a_i, a_j$ with $i < j$, there is an edge from $a_i$ to $a_j$ with probability $0.5 + \frac{(0.5 - p)(j - i)}{n - 1}$, and an edge in the reverse direction with the remaining probability. In particular, there is an edge from $a_1$ to $a_n$ with probability $1 - p$. We perform experiments on the gap model using the same values of $p$ as we did for the Condorcet random model. The only exception is $p = 0.5$, which we replace by $p = 0$, the reason being that both models coincide when $p = 0.5$. Moreover, we double the sizes of the tournaments considered for the first four values of $p$ in order to better illustrate the convergence behavior. The resulting graphs are displayed in Figure 2.
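To make the edge probabilities of the gap model concrete, here is a small sketch. The function names and the 1-based indexing convention are choices made for this illustration and do not come from the paper.

```python
import random


def gap_edge_prob(i: int, j: int, n: int, p: float) -> float:
    """Probability that a_i dominates a_j in the gap model (1 <= i < j <= n)."""
    if not (1 <= i < j <= n):
        raise ValueError("need 1 <= i < j <= n")
    return 0.5 + (0.5 - p) * (j - i) / (n - 1)


def gap_tournament(n: int, p: float, rng: random.Random):
    """Sample a tournament; beats[i-1][j-1] is True iff a_i dominates a_j."""
    beats = [[False] * n for _ in range(n)]
    for i in range(1, n + 1):
        for j in range(i + 1, n + 1):
            if rng.random() < gap_edge_prob(i, j, n, p):
                beats[i - 1][j - 1] = True
            else:
                beats[j - 1][i - 1] = True
    return beats


if __name__ == "__main__":
    n, p = 10, 0.3
    # Adjacent alternatives are close to a coin flip; the extreme pair hits 1 - p.
    print(gap_edge_prob(1, 2, n, p), gap_edge_prob(1, n, n, p))
```

With $p = 0.5$ every edge probability equals 0.5, which is the sense in which the two models coincide at that value.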
We now make some observations. Firstly, in Figure 2(b), we find that for p = 0.3 all three tournament solutions are unlikely to exclude any alternative in large tournaments; this is consistent with Theorems 1 and 4. The same phenomenon occurs for p = 2 log n/n (Figure 2(e)). We remark that these phenomena cannot be explained by any theoretical result prior to our work. In the remaining figures, we likewise find that in the gap model, all tournament solutions cease to be discriminative as the size of the tournament grows. This is the case even for the extreme value p = 0 (Figure 2(a)); the Condorcet random model for this value of p clearly always produces a Condorcet winner. Intuitively, the reason that all alternatives are likely to be chosen is that in the gap model, the overall difference in strength between alternatives is significantly less than in the Condorcet random model. Note that the observations for the latter values of p are not captured by our results, since our theorems require all edge probabilities to be in the range [p, 1−p] for appropriate values of p. Indeed, an intriguing direction for future work would be to generalize our results so that some edge probabilities are allowed to be outside of this range. 11

Conclusion
In this paper, we investigate the behavior of a number of tournament solutions in large random tournaments under a general probabilistic model. We establish tight asymptotic bounds on the boundary of the probability range for which each tournament solution is unlikely to exclude any alternative. In particular, we illustrate a difference between the discriminative power of the top cycle and the uncovered set; this difference is not evident in previous studies that focused on more restricted models. Indeed, while both tournament solutions include all alternatives with high probability in the uniform random model, our results suggest that the uncovered set is in fact considerably more discriminative than the top cycle. Our work leaves many interesting open questions for future study. A natural next step would be to use our general probabilistic model to investigate the asymptotic behavior of other tournament solutions that have been previously studied in the uniform random model, including the Banks set (Fey, 2008), the minimal covering set (Scott and Fey, 2012), and the bipartisan set (Fisher and Ryan, 1995). For instance, it is conceivable that the approach used by Fey (2008) to show that the Banks set almost never rules out any alternative in the uniform random model can be extended to establish an analogous statement when each edge probability is drawn from some constant range. It is not clear, however, whether the approach would still work if we allow the range to depend on the number of alternatives in the tournament, as we do in the current work.
From a broader point of view, we believe that an important direction is to apply our model to other tournament problems beyond those concerning tournament solutions, for example the problem of finding a dominating set of minimum size. It is well-known that a dominating set of size at most $\log_2(n + 1)$ always exists and can be found using a simple greedy algorithm. While a dominating set can be as small as a singleton in tournaments that admit a Condorcet winner, Scott and Fey (2012) showed that for uniform random tournaments, a dominating set of logarithmic size is the best that one can hope for. More precisely, these authors showed that given any constant $0 < c < 1$, the smallest dominating set of a tournament chosen uniformly at random contains at least $c \log_2 n$ alternatives with high probability. Establishing a similar result in our general probabilistic model is an intriguing technical challenge that would allow us to better understand the behavior of such structures in the real world.
11 A result of this flavor has been shown by Kim et al. (2017) in the context of single-elimination winners.
take one that minimizes the largest index $k$ such that $x_k = x_1$. Let $l$ be the smallest index such that $\sum_{i=1}^{l} x_i = \sum_{i=1}^{l} y_i$; the existence of $l$ is guaranteed by condition (ii) of Definition 1. If $l < n$, we can apply the induction hypothesis on the first $l$ components and the last $n - l$ components separately to obtain $y$ from $x$ using a finite number of equalizing moves, which would be a contradiction. Hence we may assume that $l = n$. In particular, $\sum_{i=1}^{j} x_i \ge \sum_{i=1}^{j} y_i + 1$ for all $j = 1, 2, \dots, n - 1$. Let $m$ be the smallest index such that $x_m = x_n$. If $x_k = x_m$ or $x_k = x_m + 1$, then the only vector with nonincreasing components that is majorized by $x$ is $x$ itself, a contradiction. So $x_k \ge x_m + 2$, and we may replace $(x_k, x_m)$ by $(x_k - 1, x_m + 1)$ in an equalizing move. Let $x' = (x'_1, x'_2, \dots, x'_n)$ be the vector resulting from this move, i.e., $x'_k = x_k - 1$, $x'_m = x_m + 1$, and $x'_i = x_i$ for all $i \notin \{k, m\}$. By definition of $k$ and $m$, we have $x'_1 \ge \dots \ge x'_n$. Moreover, $\sum_{i=1}^{n} x'_i = \sum_{i=1}^{n} y_i$, and for any $j = 1, 2, \dots, n - 1$ we have $\sum_{i=1}^{j} x'_i \ge \sum_{i=1}^{j} x_i - 1 \ge \sum_{i=1}^{j} y_i$. This means that $x' \succ y$. If $k = 1$, we have $x'_1 - y_1 < x_1 - y_1$, which means that we can make a sequence of equalizing moves on $x'$ to obtain $y$, a contradiction. Else, if $k > 1$, then we have $x'_1 - y_1 = x_1 - y_1$ and $x'_k < x'_1$, so we can again make a sequence of equalizing moves on $x'$ to obtain $y$ and arrive at a contradiction, completing our proof.
We now proceed to the proof of Lemma 5. Suppose that $x \succ y$, and fix $k \in \{1, 2, \dots, n\}$. By Lemma 7, there exists a sequence of equalizing moves that takes $x$ to $y$. It suffices to show that if an equalizing move takes $x$ to $x'$, then there is a corresponding sequence of equalizing moves that takes $x^{(k)}$ to $x'^{(k)}$. Indeed, if this is true, then the sequence of equalizing moves that takes $x$ to $y$ gives rise to a corresponding sequence of equalizing moves that takes $x^{(k)}$ to $y^{(k)}$. By Lemma 7 again, this will imply that $x^{(k)} \succ y^{(k)}$.
Consider an equalizing move that takes $x$ to $x'$; assume that the move replaces the components $x_i > x_j$ by $x_i - 1$ and $x_j + 1$, respectively. Note that the only components that change between $x^{(k)}$ and $x'^{(k)}$ are the ones that contain exactly one of $x_i$ and $x_j$ in their sum. These components can be paired up in such a way that for each pair, one component contains $x_i$, the other component contains $x_j$, and both components contain exactly the same subset of the remaining $x_l$'s with $l \notin \{i, j\}$. For each pair, replacing $x_i$ and $x_j$ by $x_i - 1$ and $x_j + 1$ corresponds to an equalizing move. It follows that there exists a sequence of equalizing moves that takes $x^{(k)}$ to $x'^{(k)}$, as claimed.
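The objects in this argument can be explored on small examples. The sketch below rests on assumptions that are not confirmed by the text shown here: majorization is taken to mean equal totals together with dominating prefix sums of the nonincreasingly sorted vectors, an equalizing move is any replacement of $x_i > x_j$ by $x_i - 1$ and $x_j + 1$ followed by re-sorting, and $x^{(k)}$ is read as the sorted vector of sums over all $k$-element subsets of components. All function names are hypothetical.

```python
from itertools import combinations


def majorizes(x, y):
    """True iff x majorizes y (assumed definition, see lead-in)."""
    xs, ys = sorted(x, reverse=True), sorted(y, reverse=True)
    if len(xs) != len(ys) or sum(xs) != sum(ys):
        return False
    prefix_x = prefix_y = 0
    for a, b in zip(xs, ys):
        prefix_x += a
        prefix_y += b
        if prefix_x < prefix_y:
            return False
    return True


def equalizing_move(x, i, j):
    """Move one unit from the larger component x[i] to the smaller x[j]."""
    if x[i] <= x[j]:
        raise ValueError("an equalizing move needs x[i] > x[j]")
    z = list(x)
    z[i] -= 1
    z[j] += 1
    return sorted(z, reverse=True)


def subset_sums(x, k):
    """Sorted vector of sums over all k-element subsets of components."""
    return sorted((sum(c) for c in combinations(x, k)), reverse=True)


if __name__ == "__main__":
    x, y = [4, 1, 1], [2, 2, 2]
    print(majorizes(x, y))                                   # x majorizes y
    print(equalizing_move(x, 0, 1))                          # one step toward y
    print(majorizes(subset_sums(x, 2), subset_sums(y, 2)))   # the lifted relation
```

On this example the relation carries over from the original vectors to their vectors of pair sums, which is the behavior that Lemma 5 asserts in general.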
Fix $k \in \{1, 2, \dots, n - 1\}$, and assume without loss of generality that $\deg_U(b_1) \ge \dots \ge \deg_U(b_n)$. Let $B = \{b_1, b_2, \dots, b_n\}$ and $B' = \{b_1, b_2, \dots, b_k\}$. The number of edges from an alternative in $B'$ to another alternative in $B'$ is exactly $\binom{k}{2}$. On the other hand, the number of edges from an alternative in $B'$ to an alternative in $B \setminus B'$ is at most $k(n - k)$.