Multiwinner analogues of the plurality rule: axiomatic and algorithmic perspectives

We characterize the class of committee scoring rules that satisfy the fixed-majority criterion. We argue that rules in this class are multiwinner analogues of the single-winner Plurality rule, which is uniquely characterized as the only single-winner scoring rule that satisfies the simple majority criterion. We define top-k-counting committee scoring rules and show that the fixed-majority consistent rules are a subclass of the top-k-counting rules. We give necessary and sufficient conditions for a top-k-counting rule to satisfy the fixed-majority criterion. We show that, for many top-k-counting rules, the complexity of winner determination is high (formally, we show that the problem of deciding if there exists a committee with at least a given score is NP-hard), but we also show examples of rules with polynomial-time winner determination procedures. For some of the computationally hard rules, we provide either exact FPT algorithms or approximate polynomial-time algorithms.


Introduction
A multiwinner voting rule is a formal procedure for selecting a subset of predetermined size from the available candidates in accord with the preferences of an electorate (such a subset of candidates is usually referred to as a committee). Parliamentary elections constitute one of the most classic examples where multiwinner rules are regularly used. For country-wide elections, societies typically use the district-based First-Past-the-Post rule, or a party-list system, or some mixture of the two [nonetheless, some countries use other rules for this purpose, such as SNTV or STV (Lijphart and Aitkin 1994)]. In smaller-scale elections, where it is possible for the voters to rank all the candidates, many other rules become available; for example, the k-Borda rule (Debord 1992) selects committees where each member receives broad support from the electorate, the Chamberlin-Courant rule (Chamberlin and Courant 1983) finds committees with diverse membership, and the Monroe rule (Monroe 1995) is designed to achieve proportional representation.
Apart from political elections, multiwinner rules are useful for many other purposes: to shortlist candidates for a job interview (Barberà and Coelho 2008), to determine the locations of public facilities (Zanjirani and Hekmatfar 2009), in a wide range of scenarios where resources need to be selected and assigned to the agents for their (shared) use, in segmentation problems (Kleinberg et al. 2004), or even in search strategies of genetic algorithms (Sawicki et al. 2017). In business applications, company strategists deciding which sets of products to advertise on the front pages of their websites implicitly use multiwinner elections to make their choices (Boutilier 2011, 2015). Since these tasks are very different in spirit, one may presume that not all rules are equally suitable for all scenarios. This makes the question of comparing different rules, and of understanding their nature and their shortcomings (including their computational difficulty), very relevant.
One approach to advance our understanding of the nature of multiwinner rules is to view them as extensions of certain well-understood single-winner ones. For example, Single Non-Transferable Vote (SNTV) can clearly be viewed as an extension of Plurality, because it selects the k candidates with the k highest numbers of first-place votes. However, this is not the only point of view that one can take in generalizing Plurality. For instance, one could argue that a voter's most preferred committee consists exactly of those k candidates that this voter ranks in the top k positions, and so, if under the Plurality rule a voter gives a point only to his or her most preferred candidate, then under a multiwinner Plurality a voter should give points only to those candidates that belong to his or her most preferred committee. In fact, this is exactly how the Bloc rule works, and one can argue that Bloc is an extension of Plurality to the multiwinner setting as well. Naturally, there are also many other rules that would qualify for this title. Our goal in this paper is to seek and study such rules.
Our goal requires some justification. It is widely acknowledged that the single-winner Plurality rule has only one advantage: simplicity. Apart from this, it is considered a very bad rule. For instance, during the "Voting Power in Practice" workshop, held in 2010 at the Chateau du Baffy, Normandy, the participants (who were experts in voting procedures) were asked to rank election rules. Laslier (2012) reports that Plurality was considered the worst. One of the most serious drawbacks of Plurality is that voters are pressured to vote for one of the two candidates they predict are most likely to win, even if their true most preferred candidate is neither of them; they do so for fear of casting a 'wasted vote' (Dummett 1984). However, in the multiwinner setting this pressure becomes milder, because there are more candidates to be elected. We view this as one reason why multiwinner analogues of Plurality are worth investigating.
We seek multiwinner analogues of Plurality within the family of committee scoring rules, recently introduced by Elkind et al. (2017). This is a natural choice because Plurality belongs to the class of positional scoring rules and committee scoring rules generalize this class to the multiwinner setting. (However, looking for such rules beyond the class of committee scoring rules would not be unthinkable.) Further, we take the following axiomatic approach. We note that Plurality is the only single-winner scoring rule that satisfies the simple majority criterion, which stipulates that a candidate ranked first by more than half of the voters must be the unique winner of the election. The fixed-majority criterion, introduced by Debord (1993), extends this notion to the world of multiwinner elections by requiring that, if there is a simple majority of voters, each of whom ranks the same k candidates in the top k positions (perhaps in a different order), then these k candidates should form the unique winning committee. Thus, all in all, we seek committee scoring rules that satisfy the fixed-majority criterion.
One can verify that SNTV fails the fixed-majority criterion for all k > 1, but that Bloc does satisfy it. Yet, Bloc is not the only fixed-majority consistent rule within the class of committee scoring rules. In fact, our approach led us to the discovery of a new class of voting rules, which includes all committee scoring rules satisfying the fixed-majority criterion. We call them top-k-counting rules. As in the case of Bloc, they take only the top k preferences of the voters into consideration. Specifically, under a top-k-counting rule, each voter awards points to every committee on the basis of the number of this voter's top k candidates that are members of the committee; the committee with the most points collected from all the voters wins. The function that determines the score of a committee based on the number of committee members ranked in the top k positions by a voter will be called the counting function. As it turns out, the nature of this function (e.g., whether it is convex or concave) has a very strong impact on both the axiomatic and the computational properties of the voting rule it defines.
We provide an (almost) full characterization of fixed-majority consistent committee scoring rules and we analyze the computational complexity of their winner determination problems. More specifically, we obtain the following results:
1. We prove that all committee scoring rules that satisfy the fixed-majority criterion are top-k-counting rules and we establish a condition on the counting function that is necessary and sufficient for the corresponding top-k-counting rule to satisfy the fixed-majority criterion. This condition is a fairly mild relaxation of the classic notion of convexity; in particular, if the counting function is convex then the corresponding top-k-counting rule satisfies the fixed-majority criterion.
2. We show that a number of top-k-counting rules are NP-hard to compute (for example, we exhibit a rule that closely resembles the Bloc rule and is hard even to approximate). There are, however, some polynomial-time computable ones (for example, the Bloc and the Perfectionist rules; the latter one is introduced in this paper).
3. We show that if the counting function is concave, then the corresponding top-k-counting rule fails the fixed-majority criterion, but the rule seems to be computationally easier than in the convex case. Specifically, for top-k-counting rules defined via concave counting functions we present a polynomial-time (1 − 1/e)-approximation algorithm and an exact fixed-parameter tractable algorithm (parameterized by the number of voters) for the problem of computing the highest-scoring committees.
All in all, there is no unique multiwinner analogue of Plurality, even if we restrict ourselves to polynomial-time computable committee scoring rules, but there is a rich class of such rules that deserves further investigation.

Preliminaries
An election is a pair $E = (C, V)$, where $C = \{c_1, \ldots, c_m\}$ is a set of candidates and $V = (v_1, \ldots, v_n)$ is a collection of voters. Throughout the paper, we reserve the symbol $m$ to denote the number of candidates. Each voter $v_i$ is associated with a preference order $\succ_i$ in which $v_i$ ranks the candidates from his or her most desirable one to his or her least desirable one (we assume the unrestricted domain, i.e., each voter is free to choose any preference order). If $X$ and $Y$ are two (disjoint) subsets of $C$, then by $X \succ_i Y$ we mean that for each $x \in X$ and each $y \in Y$ it holds that $x \succ_i y$. For a positive integer $t$, we denote the set $\{1, \ldots, t\}$ by $[t]$.
Single-Winner Voting Rules A single-winner voting rule $R$ is a function that, given an election $E = (C, V)$, outputs a subset $R(E) \subseteq C$ of candidates that are called (tied) winners of this election. There is quite a variety of single-winner voting rules, but for this paper it suffices to consider scoring rules only. Given a voter $v$ and a candidate $c$, we write $\mathrm{pos}_v(c)$ to denote the position of $c$ in $v$'s preference order (for example, if $v$ ranks $c$ first then $\mathrm{pos}_v(c) = 1$). A scoring function for $m$ candidates is a function $\gamma_m\colon [m] \to \mathbb{R}_+$ such that for each $i \in [m-1]$ we have $\gamma_m(i) \ge \gamma_m(i+1)$ (by $\mathbb{R}_+$ we mean the set of nonnegative real numbers). Each family of scoring functions $\gamma = (\gamma_m)_{m \in \mathbb{N}}$ (one function for each possible choice of $m$) defines a voting rule $R_\gamma$ as follows. Let $E = (C, V)$ be an election with $m$ candidates. Under $R_\gamma$, each candidate $c \in C$ receives $\mathrm{score}(c) := \sum_{v \in V} \gamma_m(\mathrm{pos}_v(c))$ points and the candidate with the highest number of points wins. (If there are several such candidates, then they all tie as winners; the term single-winner voting rule refers to the fact that we use the rule to fill-in a single position, and not to indicate that the rule is resolute.) We often refer to the value $\mathrm{score}(c)$ as the $\gamma$-score of $c$.
The following scoring functions and scoring rules are particularly interesting. The t-Approval scoring function α t is defined as α t (i) := 1 for i ≤ t and α t (i) := 0 otherwise. (If t is fixed, then the definition of α t does not depend on m; in such cases, α t can both be viewed as a scoring function and as a family of scoring functions.) For example, Plurality is R α 1 , the t-Approval rule is R α t , and the Veto rule is R (α m−1 ) m∈N . The Borda scoring function (for m candidates), β m , is defined as β m (i) := m − i, and R β is the Borda rule, where β = (β m ) m∈N . This notation for these scoring functions will be used throughout the paper.
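The definitions above translate directly into code. The following is a minimal sketch, not part of the original development: the helper names (`approval`, `borda`, `scores`, `winners`) and the five-voter toy election are illustrative assumptions. Votes are tuples listing the candidates from most to least preferred.

```python
from typing import Callable, Dict, List, Sequence, Set

Vote = Sequence[str]  # candidates listed from most to least preferred


def approval(t: int) -> Callable[[int, int], int]:
    """t-Approval scoring function: alpha_t(i) = 1 if i <= t, else 0."""
    return lambda m, i: 1 if i <= t else 0


def borda(m: int, i: int) -> int:
    """Borda scoring function: beta_m(i) = m - i."""
    return m - i


def scores(votes: List[Vote], gamma: Callable[[int, int], int]) -> Dict[str, int]:
    """score(c) = sum over voters of gamma_m(pos_v(c))."""
    m = len(votes[0])
    total = {c: 0 for c in votes[0]}
    for vote in votes:
        for pos, c in enumerate(vote, start=1):
            total[c] += gamma(m, pos)
    return total


def winners(votes: List[Vote], gamma: Callable[[int, int], int]) -> Set[str]:
    """All candidates with the highest gamma-score (ties are kept)."""
    s = scores(votes, gamma)
    best = max(s.values())
    return {c for c, value in s.items() if value == best}


# Hypothetical toy election with three candidates and five voters.
votes = [("a", "b", "c"), ("a", "c", "b"), ("b", "c", "a"),
         ("c", "b", "a"), ("b", "a", "c")]
plurality = approval(1)
print(winners(votes, plurality))  # a and b tie with two first places each
print(winners(votes, borda))      # b wins with Borda score 6
```

In this toy election Plurality and Borda disagree, which is the kind of difference between scoring rules that the rest of the discussion builds on.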
Multiwinner Voting Rules A multiwinner voting rule R is a function that given an election E = (C, V ) and a number k representing the size of the desired committee, outputs a family R(E, k) of size-k subsets of C; the sets in this family are the committees that tie as winners. As in the case of single-winner voting rules, one may need a tie-breaking rule to get a unique winning committee, but we ignore this aspect in the current paper.
We focus on the class of committee scoring rules, introduced by Elkind et al. (2017) (we remark that the conference version of their paper was published in 2014). Consider an election $E = (C, V)$ and some committee $S$ of a given size $k$. Let $v$ be some voter in $V$. By $\mathrm{pos}_v(S)$ we mean the sequence $(i_1, \ldots, i_k)$ that results from sorting the set $\{\mathrm{pos}_v(c) : c \in S\}$ in a strictly increasing order. For example, if $C = \{a, b, c, d, e\}$, the preference order of $v$ is $a \succ b \succ c \succ d \succ e$, and $S = \{a, c, d\}$, then $\mathrm{pos}_v(S) = (1, 3, 4)$. If $I = (i_1, \ldots, i_k)$ and $J = (j_1, \ldots, j_k)$ are two strictly increasing sequences of integers, then we say that $I$ (weakly) dominates $J$ (denoted $I \succeq J$) if $i_t \le j_t$ for each $t \in [k]$. For positive integers $m$ and $k$, $k \le m$, by $[m]_k$ we mean the set of all strictly increasing size-$k$ sequences of integers from $[m]$.
Definition 1 A committee scoring function for a multiwinner election with $m$ candidates, where we seek a committee of size $k$, is a function $f_{m,k}\colon [m]_k \to \mathbb{R}_+$ such that for each two sequences $I, J \in [m]_k$ it holds that if $I \succeq J$ then $f_{m,k}(I) \ge f_{m,k}(J)$.
Intuitively, the function $f_{m,k}$ from Definition 1 assigns to each sequence $I$ of $k$ positions the number of points that a committee $S$ gets from a voter $v$ when the members of $S$ stand on exactly the positions of $I$ in the preference order of $v$.
A committee scoring rule is defined by a family of committee scoring functions $f = (f_{m,k})_{k \le m}$, which contains one function for each possible choice of $m$ and $k$. Analogously to the case of single-winner scoring rules, we will denote such a multiwinner rule by $R_f$. Let $E = (C, V)$ be an election with $m$ candidates and let $k$, $k \le m$, be the size of the desired committee. Under the committee scoring rule $R_f$, every committee $S \subseteq C$ with $|S| = k$ receives $\mathrm{score}(S) := \sum_{v \in V} f_{m,k}(\mathrm{pos}_v(S))$ points (for this notation, the election $E = (C, V)$ will always be clear from the context). The committee with the highest score wins. (If there are several such committees, then they all tie as winners.) Many well-known multiwinner voting rules are, in fact, committee scoring rules. Consider the following examples (we will use them throughout the paper to illustrate various points):
1. The SNTV, Bloc, and k-Borda rules pick k candidates with the highest Plurality, k-Approval, and Borda scores, respectively, and so they are defined through the following scoring functions:
$$f^{\mathrm{SNTV}}_{m,k}(i_1, \ldots, i_k) := \sum_{t=1}^{k} \alpha_1(i_t), \qquad f^{\mathrm{Bloc}}_{m,k}(i_1, \ldots, i_k) := \sum_{t=1}^{k} \alpha_k(i_t), \qquad f^{k\text{-Borda}}_{m,k}(i_1, \ldots, i_k) := \sum_{t=1}^{k} \beta_m(i_t).$$
Note that $f^{\mathrm{SNTV}}_{m,k}$ is defined as a sum of functions that do not depend on either $m$ or $k$, $f^{\mathrm{Bloc}}_{m,k}$ is defined as a sum of functions that depend on $k$ but not $m$, and $f^{k\text{-Borda}}_{m,k}$ is defined as a sum of functions that depend on $m$ but not $k$.
2. The two versions of the Chamberlin-Courant rule that we consider are defined through the following committee scoring functions, respectively:
$$f^{\beta\text{-CC}}_{m,k}(i_1, \ldots, i_k) := \beta_m(i_1), \qquad f^{\alpha_k\text{-CC}}_{m,k}(i_1, \ldots, i_k) := \alpha_k(i_1).$$
The first one defines the classical Chamberlin-Courant rule (Chamberlin and Courant 1983) and the second one defines what we refer to as the k-Approval Chamberlin-Courant rule [approval-based variants of the Chamberlin-Courant rule were first mentioned by Thiele (1895) and recently they were recalled by Procaccia et al. (2008); they were studied further, for example, by Betzler et al. (2013), Aziz et al. (2017), and Skowron and Faliszewski (2017)].
For brevity, we sometimes refer to the k-Approval Chamberlin-Courant rule as the α k -CC rule.
Intuitively, under the Chamberlin-Courant rules, each voter is represented by the committee member that this voter ranks highest; the Chamberlin-Courant rule chooses a committee $S$ that maximizes the sum of the scores that the voters give to their representatives in $S$ (which characterizes the total satisfaction of the society with the assignment of representatives to the voters).
3. We introduce the Perfectionist rule. This rule is defined through scoring functions of the form:
$$f^{\mathrm{Perf}}_{m,k}(i_1, \ldots, i_k) := \begin{cases} 1 & \text{if } (i_1, \ldots, i_k) = (1, \ldots, k), \\ 0 & \text{otherwise.} \end{cases}$$
In other words, a voter gives a score of 1 to a committee only if its members occupy the top k positions of his or her vote. The rule is not necessarily very appealing, but it has interesting features that will illustrate several points that we make throughout our discussion.
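The rules listed above can be compared on small instances by brute force. The sketch below is only an illustration under stated assumptions: the function names and the three-voter election are hypothetical, the scoring functions follow the verbal descriptions of the rules given above, and enumerating all size-k committees takes time exponential in k, so this is not an efficient winner determination procedure.

```python
from itertools import combinations


def positions(vote, committee):
    """pos_v(S): increasingly sorted 1-based positions of the members of S."""
    return tuple(sorted(vote.index(c) + 1 for c in committee))


# Committee scoring functions f_{m,k}, applied to a sorted position tuple.
def f_sntv(m, k, pos): return sum(1 for i in pos if i <= 1)
def f_bloc(m, k, pos): return sum(1 for i in pos if i <= k)
def f_kborda(m, k, pos): return sum(m - i for i in pos)
def f_beta_cc(m, k, pos): return m - pos[0]
def f_alpha_cc(m, k, pos): return 1 if pos[0] <= k else 0
def f_perfectionist(m, k, pos): return 1 if pos == tuple(range(1, k + 1)) else 0


def winning_committees(votes, k, f):
    """All size-k committees with the highest total score (ties are kept)."""
    candidates = sorted(votes[0])
    m = len(candidates)
    score = {S: sum(f(m, k, positions(v, S)) for v in votes)
             for S in combinations(candidates, k)}
    best = max(score.values())
    return [S for S, s in score.items() if s == best]


# Hypothetical toy election: four candidates, three voters, k = 2.
votes = [("a", "b", "c", "d"), ("a", "b", "d", "c"), ("c", "a", "b", "d")]
for name, f in [("SNTV", f_sntv), ("Bloc", f_bloc), ("k-Borda", f_kborda),
                ("beta-CC", f_beta_cc), ("alpha_k-CC", f_alpha_cc),
                ("Perfectionist", f_perfectionist)]:
    print(name, winning_committees(votes, 2, f))
```

On this toy election SNTV and the classical Chamberlin-Courant rule select {a, c}, whereas Bloc, k-Borda, and Perfectionist select {a, b}; the k-Approval Chamberlin-Courant rule produces a four-way tie.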
Below we provide an example election where SNTV, Bloc, k-Borda, β-CC, α k -CC, and Perfectionist give different outcomes (with the exception that the results of Bloc and α k -CC are the same).
Example 1 Let us consider the set of candidates C = {a, b, c, d, e, f, g, h} and eight voters with the following preference orders: Let the committee size k be 2. It is easy to compute the winners under the SNTV and Bloc rules. For the former, the unique winning committee is {a, b} (these are the only two candidates that are ranked in the top positions twice), and for the latter it is {e, f } (these are the only two candidates that are ranked among top two positions three times; all the other candidates are ranked there at most twice). A somewhat tedious calculation shows that the unique k-Borda winning committee is {g, h}, which follows since the Borda scores of the candidates a, b, c, d, e, f, g, h are, respectively: 32, 22, 23, 23, 28, 26, 35, 35. Further calculations show that under the (classical) Chamberlin-Courant rule, the unique winning committee is {c, d}. (While it is tedious to compute these results by hand, and indeed we used a computer to find them, the intuition for the k-Borda and Chamberlin-Courant winners is as follows: g and h are always ranked in the middle of each vote, or slightly above, so that they get high total Borda score, whereas c and d are ranked so that one of them is (almost) always ahead of g and h, whereas the other one is in the last position. This way, as representatives, c and d get higher scores than g and h, even though their total Borda score is lower.) On the other hand, it is relatively easy to verify that under α k -CC, the winning committee is {e, f } (its α k -CC score is six; there is no other committee whose members are ranked among the top two positions of six or more voters).
Finally, let us consider the Perfectionist rule. It assigns two points to committee {a, f }, one point to each of {b, c}, {b, d}, {c, e}, {d, e}, {e, g}, and { f, h}, and zero points to all the other committees. Thus, {a, f } is the unique winning committee.
All the above rules are examples of OWA-based committee scoring rules, i.e., their committee scoring functions can be expressed as ordered weighted averages (OWAs) of single-winner scores. Formally, an OWA operator of dimension $k$ is a sequence $\Lambda = (\lambda^1, \ldots, \lambda^k)$ of nonnegative reals, and the class of OWA-based rules is defined as follows.
Definition 2 Let $\Lambda = (\Lambda_{m,k})_{k \le m}$ be a family of OWA operators such that $\Lambda_{m,k} = (\lambda^1_{m,k}, \ldots, \lambda^k_{m,k})$ has dimension $k$ (one size-$k$ vector for each pair $m, k$). Let $\gamma = (\gamma_{m,k})_{k \le m}$ be a family of (single-winner) scoring functions (one scoring function for each pair $m, k$). Then $\gamma$ together with $\Lambda$ define a family of committee scoring functions $f = (f^{\Lambda,\gamma}_{m,k})_{k \le m}$ such that for each $(i_1, \ldots, i_k) \in [m]_k$ we have:
$$f^{\Lambda,\gamma}_{m,k}(i_1, \ldots, i_k) = \sum_{t=1}^{k} \lambda^t_{m,k}\, \gamma_{m,k}(i_t).$$
The committee scoring rule $R_f$ corresponding to the family $f$ is called OWA-based.
Intuitively, the OWA operators specify to what extent the voters care about each member of the committee, depending on how this member is ranked among the other ones. For example, rules with OWA operators of the form $(1, \ldots, 1)$, such as SNTV, Bloc, or k-Borda, care about all the committee members equally, whereas rules with OWA operators of the form $(1, 0, \ldots, 0)$, such as our two versions of the Chamberlin-Courant rule, care about the top-ranked committee members only. Rules of the first type are called weakly separable, and those of the second type are called representation focused. Naturally, there are also many other choices of OWA operators. For example, the t-Approval variant of the Proportional Approval Voting rule ($\alpha_t$-PAV) uses OWA operators of the form $(1, \frac{1}{2}, \ldots, \frac{1}{k})$, indicating the decreasing attention the voters pay to their lower-ranked committee members; the Perfectionist rule uses the OWA operator $(0, \ldots, 0, 1)$. For more discussions regarding the OWA-based rules, we refer the reader to the works of Aziz et al. (2015, 2017) and Lackner and Skowron (2017) (the latter ones include a more detailed discussion of PAV; see also the work of Kilgour (2010) for a description of this rule).
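To make the role of the OWA vector concrete, the following sketch evaluates several OWA-based committee scoring functions on a single position sequence. It is an illustration only: the helper names are assumptions, and the weighted-sum form (the t-th weight multiplies the single-winner score of the t-th best ranked committee member) is taken from the description of OWA-based rules above.

```python
from fractions import Fraction


def alpha(t):
    """t-Approval scoring function."""
    return lambda i: 1 if i <= t else 0


def beta(m):
    """Borda scoring function for m candidates."""
    return lambda i: m - i


def owa_score(lam, gamma, pos):
    """Weighted sum of single-winner scores; pos must be sorted increasingly."""
    return sum(w * gamma(i) for w, i in zip(lam, pos))


m, k = 5, 3
pos = (1, 3, 4)  # positions of the committee members in one vote

bloc = owa_score((1, 1, 1), alpha(k), pos)           # all members count equally
k_borda = owa_score((1, 1, 1), beta(m), pos)
beta_cc = owa_score((1, 0, 0), beta(m), pos)         # only the representative counts
pav = owa_score((Fraction(1), Fraction(1, 2), Fraction(1, 3)), alpha(k), pos)
perfectionist = owa_score((0, 0, 1), alpha(k), pos)  # only the worst member counts

print(bloc, k_borda, beta_cc, pav, perfectionist)
```

Exact rationals (`Fraction`) are used for the PAV weights so that scores can be compared without floating-point rounding.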
Remark 1 We note that in most cases the OWA vectors $\Lambda_{m,k}$ used to define OWA-based rules do not depend on $m$. Yet, formally, we allow for such a dependence in order to build the relation between our general framework, in which committee scoring functions $f_{m,k}$ might depend on $m$ in any, even not very intuitive, way, and the world of OWA-based rules.
Naturally, there are also committee scoring rules that are not OWA-based. For example, the family of $\ell_p$-Borda rules has been studied, with committee scoring functions of the following form ($p \ge 1$):
$$f^{\ell_p\text{-Borda}}_{m,k}(i_1, \ldots, i_k) = \left(\beta_m(i_1)^p + \cdots + \beta_m(i_k)^p\right)^{1/p}.$$
In particular, the $\ell_p$-Borda rules (for $p > 1$) are, in a certain sense, between the k-Borda rule (which is simply the $\ell_1$-Borda rule) and the classical Chamberlin-Courant rule (which, with slight abuse of notation, could be referred to as $\ell_\infty$-Borda).

Fixed-majority consistent rules
We are ready to start our quest for finding committee scoring rules that can be seen as multiwinner analogues of Plurality. We begin by describing the fixed-majority criterion that, in our view, encapsulates the idea of "closeness" to Plurality. Then, we provide a class of committee scoring rules (the class of top-k-counting rules) that contains all the rules which satisfy the fixed-majority criterion. Finally, we provide an almost complete characterization of those top-k-counting rules that have the fixed-majority property.

Initial remarks
One of the features that distinguishes Plurality among all the other scoring rules is the fact that it satisfies the simple majority criterion.
Definition 3 A single-winner voting rule R satisfies the simple majority criterion if for every election E = (C, V ) where more than half of the voters rank some candidate c first, it holds that R(E) = {c}.
Importantly, the simple majority criterion indeed characterizes Plurality within the class of single-winner scoring rules. The result is a part of folklore (we provide the proof for the sake of completeness).
Proposition 1 Let γ = (γ m ) m∈N be a family of single-winner scoring functions that defines a scoring rule R γ . Then, R γ satisfies the simple majority criterion if and only if for each m it holds that γ m (1) > γ m (2) = · · · = γ m (m) (that is, if and only if R γ coincides with Plurality).
Proof It is straightforward to verify that if for each m we have γ m (1) > γ m (2) = · · · = γ m (m) then R γ satisfies the simple majority criterion. For the other direction, assume that R γ satisfies the simple majority criterion. This immediately implies that for each m ≥ 2 we have γ m (1) > γ m (m) (otherwise all the candidates would always tie as winners). Hence for m = 2 the result follows.
Let us fix $m \ge 3$. For each positive integer $n$, define the election $E_n = (C, V_n)$ with the candidate set $C = \{c_1, \ldots, c_m\}$ and with $V_n$ containing: $n + 1$ voters with preference order $c_1 \succ c_2 \succ \cdots \succ c_m$ and $n$ voters with preference order $c_2 \succ c_3 \succ \cdots \succ c_m \succ c_1$.
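Since $R_\gamma$ satisfies the simple majority criterion, it must be the case that $c_1$ is the unique $R_\gamma$-winner for each $E_n$. Further, for a given value of $n$, the difference between the scores of $c_1$ and $c_2$ in $E_n$ is:
$$\big((n+1)\gamma_m(1) + n\gamma_m(m)\big) - \big((n+1)\gamma_m(2) + n\gamma_m(1)\big) = \big(\gamma_m(1) - \gamma_m(2)\big) - n\big(\gamma_m(2) - \gamma_m(m)\big).$$
Thus, if it held that $\gamma_m(2) > \gamma_m(m)$, then, for a large enough value of $n$, candidate $c_1$ would not be a winner of $E_n$. Hence $\gamma_m(2) = \gamma_m(m)$ and, since scoring functions are nonincreasing, $\gamma_m(2) = \cdots = \gamma_m(m)$. Since $\gamma_m(1) > \gamma_m(m)$, we reach the conclusion that $\gamma_m(1) > \gamma_m(2) = \cdots = \gamma_m(m)$.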
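The election family used in the proof can be checked numerically. The sketch below is a hedged illustration with hypothetical helper names: it builds the election with n + 1 voters ranking c1 first and n voters ranking c1 last, and compares the score gap between c1 and c2 with the closed form obtained by expanding the two scores. Scoring functions are given as lists, with `gamma[0]` standing for the score of the first position.

```python
def build_election(m, n):
    """n + 1 votes c1 > c2 > ... > cm and n votes c2 > ... > cm > c1."""
    cands = [f"c{i}" for i in range(1, m + 1)]
    majority = tuple(cands)
    minority = tuple(cands[1:] + cands[:1])
    return [majority] * (n + 1) + [minority] * n


def score(votes, gamma, c):
    """Total score of candidate c; gamma[i] is the score of position i + 1."""
    return sum(gamma[v.index(c)] for v in votes)


def score_gap(m, n, gamma):
    votes = build_election(m, n)
    return score(votes, gamma, "c1") - score(votes, gamma, "c2")


def closed_form(n, gamma):
    """(gamma(1) - gamma(2)) - n * (gamma(2) - gamma(m))."""
    return (gamma[0] - gamma[1]) - n * (gamma[1] - gamma[-1])


borda3 = [2, 1, 0]      # Borda for m = 3
plurality3 = [1, 0, 0]  # Plurality for m = 3
for n in range(1, 5):
    print(n, score_gap(3, n, borda3), score_gap(3, n, plurality3))
```

For Borda with m = 3 the gap equals 1 − n, so c1 already fails to be the unique winner at n = 1, even though a majority ranks c1 first; for Plurality the gap stays at 1 for every n.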
There are at least two ways of generalizing the simple majority criterion to the multiwinner setting. We choose perhaps the simplest one, the fixed-majority criterion introduced by Debord (1993) (other notions of majority studied by Debord are variants of the Condorcet principle and are incompatible with Plurality and scoring rules in general).

Definition 4
A multiwinner voting rule R satisfies the fixed-majority criterion for m candidates and committee size k if for every election E = (C, V ) with m candidates the following holds: if there is a committee W of size k such that more than half of the voters rank all the members of W above the non-members of W (equivalently: put the candidates from W on top), then R(E, k) = {W }. We say that R satisfies the fixed-majority criterion if it satisfies it for all choices of m and k (with k ≤ m).
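The premise of the criterion is easy to test on a concrete profile. The sketch below uses hypothetical helper names and is not a procedure from the original text: it looks for a committee that more than half of the voters place in their top k positions (in any order), and checks whether a given set of winning committees is consistent with the criterion on that single election.

```python
from collections import Counter


def fixed_majority_committee(votes, k):
    """Committee put on top by more than half of the voters, or None."""
    tops = Counter(frozenset(v[:k]) for v in votes)
    top_set, count = tops.most_common(1)[0]
    return set(top_set) if 2 * count > len(votes) else None


def respects_fixed_majority(votes, k, winning_committees):
    """True if the outcome is consistent with the criterion on this election."""
    W = fixed_majority_committee(votes, k)
    return W is None or [set(S) for S in winning_committees] == [W]


# Hypothetical toy election: two of the three voters put {a, b} on top.
votes = [("a", "b", "c", "d"), ("b", "a", "d", "c"), ("c", "d", "a", "b")]
print(fixed_majority_committee(votes, 2))
print(respects_fixed_majority(votes, 2, [("a", "b")]))
print(respects_fixed_majority(votes, 2, [("a", "c")]))
```

Note that at most one committee can be returned by `fixed_majority_committee`, since two different top-k sets cannot each be supported by more than half of the voters. Passing the check on one election does not show that a rule satisfies the criterion, which quantifies over all elections.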
Remark 2 Another possible way of extending the simple majority criterion to the multiwinner case would be to say that if a committee W is such that for each c ∈ W a majority of voters rank c among their top k positions (possibly a different majority for each c), then W must be a winning committee. However, consider the following votes over the candidate set {a, b, c}: For k = 2, all three committees, {a, b}, {a, c}, and {b, c} have majority support in the sense just described. We feel that this is against the spirit of the simple majority criterion (since at most one candidate can be ranked on the top position by more than half of the voters, we feel that there should be at most one committee that can claim to have the majority support). Thus, and since we have not found any other convincing ways of generalizing the simple majority criterion to the multiwinner setting, we focus on Debord's fixed-majority notion.
It seems that the fixed-majority criterion is far more important for the multiwinner setting than the simple majority criterion is for the single-winner one. For example, one can verify that the Bloc rule satisfies the fixed-majority criterion and, in fact, this property is crucial in explaining its inner workings (we characterize the Bloc rule as the unique committee scoring rule that is noncrossing monotone and that satisfies the fixed-majority criterion). This is important as in practice Bloc is among the most commonly used multiwinner rules. Further, the fixed-majority property may be useful when arguing that a given voting rule is appropriate for a setting where the selected committee needs strong legitimization: If a rule fails the fixed-majority property, then it is possible that even though a majority of the voters agree which committee is the best, a different committee is elected (whose legitimacy might be questioned by this majority).
While the Bloc rule satisfies the fixed-majority criterion, the SNTV rule does not (it will follow formally from our further discussion). This means that in our axiomatic sense, Bloc is closer to Plurality than SNTV. This is quite interesting since one's first idea of generalizing Plurality would likely be to think of SNTV. Yet, Bloc is certainly not the only committee scoring rule that satisfies our criterion. For example, the Perfectionist rule satisfies the fixed-majority criterion and, indeed, closely resembles Plurality. The following remark strongly highlights this similarity.

Remark 3
Consider a situation where the voters extend their rankings of candidates to rankings of committees in some natural way (see, e.g., the work of Barberà et al. (2004) for an overview of how this may be done). Then, for each voter, the best committee would consist of his or her k best candidates. As a result, running Plurality on the profile of preferences over the committees would give the same result as running Perfectionist over the profile of preferences over the candidates.
Naturally, not all committee scoring rules satisfy the fixed-majority criterion. For example, neither k-Borda nor the Chamberlin-Courant rule does. To see this, it suffices to note that for k = 1 they both become the single-winner Borda rule, which fails the simple majority criterion.

Top-k-counting rules
To characterize the committee scoring rules that satisfy the fixed-majority criterion, we introduce the class of scoring functions that depend only on the number of committee members ranked in the top k positions.

Definition 5
We say that a committee scoring function $f_{m,k}\colon [m]_k \to \mathbb{R}_+$ is top-k-counting if there is a function $g_{m,k}\colon \{0, \ldots, k\} \to \mathbb{R}_+$ such that for each $(i_1, \ldots, i_k) \in [m]_k$ we have $f_{m,k}(i_1, \ldots, i_k) = g_{m,k}(|\{t \in [k] : i_t \le k\}|)$. We refer to $g_{m,k}$ as the counting function for $f_{m,k}$. We say that a committee scoring rule $R_f$ is top-k-counting if it can be defined through a family of top-k-counting scoring functions.
Both Bloc and Perfectionist are top-k-counting rules. The former uses the linear counting function $g_{m,k}(x) = x$, while the latter uses the counting function $g_{m,k}$ which is a step function: $g_{m,k}(x) = 0$ for $x < k$ and $g_{m,k}(k) = 1$. Another example of a top-k-counting rule is the $\alpha_k$-CC rule, which uses the counting function $g_{m,k}$ such that $g_{m,k}(0) = 0$ and $g_{m,k}(x) = 1$ for each $x \in \{1, \ldots, k\}$.
Top-k-counting rules have a number of interesting features. First, their counting functions have to be nondecreasing. Second, every top-k-counting rule is OWA-based. Third, every committee scoring rule that satisfies the fixed-majority criterion is top-k-counting. We express these facts in the following two propositions and in Theorem 4.
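A top-k-counting rule is fully determined by its counting function, which the following brute-force sketch makes explicit. The helper names and the toy election are illustrative assumptions, and the exhaustive enumeration of committees is meant only for small instances.

```python
from itertools import combinations


def top_k_count(vote, committee, k):
    """Number of committee members that the voter ranks in the top k positions."""
    return len(set(vote[:k]) & set(committee))


def winners(votes, k, g):
    """Winning committees of the top-k-counting rule with counting function g."""
    cands = sorted(votes[0])
    score = {S: sum(g(top_k_count(v, S, k)) for v in votes)
             for S in combinations(cands, k)}
    best = max(score.values())
    return [S for S, s in score.items() if s == best]


def g_bloc(k):
    return lambda x: x                     # linear counting function


def g_perfectionist(k):
    return lambda x: 1 if x == k else 0    # step at x = k


def g_alpha_cc(k):
    return lambda x: 1 if x >= 1 else 0    # step at x = 1


# Hypothetical toy election: two of the three voters rank {a, b} on top.
votes = [("a", "b", "c", "d"), ("b", "a", "c", "d"), ("c", "d", "a", "b")]
k = 2
print(winners(votes, k, g_bloc(k)))
print(winners(votes, k, g_perfectionist(k)))
print(winners(votes, k, g_alpha_cc(k)))
```

In this toy election Bloc and Perfectionist elect {a, b}, the committee that a majority of the voters rank on top, whereas the k-Approval Chamberlin-Courant rule gives {a, b} a score of 2 and each of {a, c}, {a, d}, {b, c}, {b, d} a score of 3. So the shape of the counting function alone can decide whether the majority's committee wins.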
For the rest of the paper we make the assumption that m ≥ 2k; this assumption is technical as our arguments are greatly simplified by the fact that we can form two disjoint committees of size k. Further, it is also quite natural: one could say that if we were to choose a committee consisting of more than half of the candidates, then perhaps we should rather be voting for who should not be in the elected committee. We are not sure whether this assumption can be dropped.

Proposition 2
Let m ≥ 2k and let f m,k : [m] k → R + be a top-k-counting scoring function defined through a counting function g m,k . Then, g m,k is nondecreasing.
Without the assumption that m ≥ 2k, Proposition 2 would have to be phrased more cautiously, and would speak only of the existence of a nondecreasing counting function. (For example, for m = k, the function g m,k could be arbitrary.)

Proposition 3 Every top-k-counting rule is OWA-based.
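Proof Let us consider a top-k-counting rule $R_f$, where $f = (f_{m,k})_{k \le m}$ is the corresponding family of top-k-counting functions defined by a family of counting functions $(g_{m,k})_{k \le m}$. Let us consider one function $f_{m,k}$ from this family. We know that $f_{m,k}\colon [m]_k \to \mathbb{R}_+$ is a top-k-counting scoring function defined through a counting function $g_{m,k}$, so that $f_{m,k}(i_1, \ldots, i_k) = g_{m,k}(s)$, where $s = |\{t \in [k] : i_t \le k\}|$. Since the sequence $(i_1, \ldots, i_k)$ is increasing, we have $\alpha_k(i_t) = 1$ exactly for $t \le s$, and a telescoping sum gives:
$$f_{m,k}(i_1, \ldots, i_k) = g_{m,k}(0) + \sum_{t=1}^{k} \big(g_{m,k}(t) - g_{m,k}(t-1)\big)\, \alpha_k(i_t),$$
from which we see that $R_f$ is OWA-based (the additive constant $g_{m,k}(0)$ is the same for every committee and so does not affect the set of winners) through the family of OWA operators:
$$\Lambda_{m,k} = \big(g_{m,k}(1) - g_{m,k}(0),\; g_{m,k}(2) - g_{m,k}(1),\; \ldots,\; g_{m,k}(k) - g_{m,k}(k-1)\big)$$
and the family of k-Approval scoring functions ($\gamma_{m,k} = \alpha_k$).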
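The argument can be sanity-checked by exhaustive enumeration. The sketch below rests on an assumption spelled out here rather than quoted from the proof: the OWA weights are taken to be the consecutive differences g(t) − g(t−1), applied to k-Approval scores, with the constant g(0) added back. All function names are hypothetical.

```python
from itertools import combinations


def counting_form(g, k, pos):
    """g applied to the number of positions that lie in the top k."""
    return g(sum(1 for i in pos if i <= k))


def owa_form(g, k, pos):
    """g(0) plus the OWA sum with weights g(t) - g(t-1) over k-Approval scores."""
    lam = [g(t) - g(t - 1) for t in range(1, k + 1)]
    return g(0) + sum(w * (1 if i <= k else 0) for w, i in zip(lam, pos))


def forms_agree(g, m, k):
    """Compare both forms on every increasing size-k sequence from [m]."""
    return all(owa_form(g, k, pos) == counting_form(g, k, pos)
               for pos in combinations(range(1, m + 1), k))


m, k = 6, 3
examples = {
    "Bloc": lambda x: x,
    "Perfectionist": lambda x: 1 if x == k else 0,
    "alpha_k-CC": lambda x: 1 if x >= 1 else 0,
    "convex": lambda x: x * x,
    "shifted": lambda x: x + 2,  # g(0) != 0, so the constant term matters
}
for name, g in examples.items():
    print(name, forms_agree(g, m, k))
```

The two forms agree because the sequence of positions is sorted, so the k-Approval scores equal 1 exactly on a prefix of the sequence and the weighted sum telescopes.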
In the next theorem (and in many further theorems) we speak of a committee scoring rule R f defined through a family of committee scoring functions f = ( f m,k ) 2k≤m . We use this notation as a shorthand for the assumption that the theorem is restricted to the cases where 2k ≤ m.
Theorem 4 Let $f = (f_{m,k})_{2k \le m}$ be a family of committee scoring functions. If $R_f$ satisfies the fixed-majority criterion, then $R_f$ is top-k-counting.
Proof Let us fix two numbers $m$ and $k$ such that $2k \le m$. Consider an election with $m$ candidates, where a committee of size $k$ is to be elected. For each integer $t$ such that $0 \le t \le k$ we define the following two sequences from $[m]_k$:
1. $I_t = (1, \ldots, t, k+1, \ldots, 2k-t)$ is a sequence of positions of the candidates where the first $t$ candidates are ranked in the top $t$ positions and the remaining $k - t$ candidates are ranked just below the $k$th position.
2. $J_t = (k-t+1, \ldots, k, m-(k-t)+1, \ldots, m)$ is a sequence of positions where the first $t$ candidates are ranked just above (and including) the $k$th position, whereas the remaining $k - t$ candidates are ranked at the bottom.
Among the sequences from $[m]_k$ that have exactly $t$ entries in the top $k$ positions, $I_t$ dominates all the others and $J_t$ is dominated by all the others; in particular, $f_{m,k}(I_t) \ge f_{m,k}(J_t)$. Suppose, for the sake of contradiction, that $f_{m,k}(I_t) > f_{m,k}(J_t)$ for some $t$. Let $E$ be an election with $m$ candidates and $2n + 1$ voters. The set of candidates is $C = X \cup Y \cup Z \cup D$, where $X = \{x_1, \ldots, x_t\}$, $Y = \{y_{t+1}, \ldots, y_k\}$, $Z = \{z_{t+1}, \ldots, z_k\}$, and $D$ is a set of sufficiently many dummy candidates so that $|C| = m$. We focus on two committees, $M = X \cup Y$ and $N = X \cup Z$. The first $n + 1$ voters have preference order $X \succ Y \succ Z \succ D$, and the next $n$ voters have preference order $Z \succ X \succ D \succ Y$. Note that the fixed-majority criterion requires that $M$ be the unique winning committee.
Committee $M$ receives the total score of $(n + 1) f_{m,k}(I_k) + n f_{m,k}(J_t)$, whereas committee $N$ receives the total score of $(n + 1) f_{m,k}(I_t) + n f_{m,k}(I_k)$. The difference between these values is:
$$\mathrm{score}(M) - \mathrm{score}(N) = f_{m,k}(I_k) - f_{m,k}(I_t) + n\big(f_{m,k}(J_t) - f_{m,k}(I_t)\big),$$
which, for a large enough value of $n$, is negative (since, by assumption, we know that $f_{m,k}(J_t) - f_{m,k}(I_t)$ is negative). That is, for large enough $n$, committee $M$ does not win the election and $R_f$ fails the fixed-majority criterion.
So, if $R_f$ satisfies the fixed-majority criterion, then for every $t \in \{0, \ldots, k\}$ we have that $f_{m,k}(I_t) = f_{m,k}(J_t)$. This, however, means that $f_{m,k}$ is a top-k-counting scoring function. To see this, consider some sequence of positions $L = (\ell_1, \ldots, \ell_k) \in [m]_k$ where exactly the first $t$ entries are smaller than or equal to $k$. Clearly, $I_t$ dominates $L$ and $L$ dominates $J_t$, and so $f_{m,k}(I_t) \ge f_{m,k}(L) \ge f_{m,k}(J_t)$. Since $f_{m,k}(I_t) = f_{m,k}(J_t)$, we get $f_{m,k}(L) = f_{m,k}(I_t)$, which means that $f_{m,k}(i_1, \ldots, i_k)$ depends only on the cardinality of the set $\{t \in [k] : i_t \le k\}$. Since $m$ and $k$ were chosen arbitrarily (with $2k \le m$), this completes the proof.
Unfortunately, the converse of Theorem 4 does not hold: α k -CC, for example, is a top-k-counting rule that fails the fixed-majority criterion.

Example 2 Consider an election
and k = 2. Let the preference orders of the voters be: The fixed-majority criterion requires {a, b} to be the only winning committee, while under α k -CC, other committees, such as {a, c}, have strictly higher scores. (Incidentally, this example also witnesses that SNTV fails the fixed-majority criterion; this is hardly surprising since SNTV is not a top-k-counting rule.)
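To make the phenomenon concrete, here is a small sketch with a hypothetical election of our own choosing (it is not claimed to be the election used in Example 2). Three of the five voters rank a and b in their top two positions, yet under the counting function of α k -CC the committee {a, c} scores strictly higher than {a, b}; under Bloc the order is reversed. The helper `score` and the variable names are illustrative assumptions.

```python
def score(votes, committee, g):
    """Total top-k-counting score of a committee under counting function g."""
    k = len(committee)
    return sum(g(len(set(v[:k]) & set(committee))) for v in votes)


# Hypothetical election: 3 voters rank a > b > c > d, 2 voters rank c > d > a > b.
votes = [list("abcd")] * 3 + [list("cdab")] * 2

cc = lambda x: min(x, 1)   # counting function of alpha_k-CC
bloc = lambda x: x         # counting function of Bloc

cc_ab, cc_ac = score(votes, "ab", cc), score(votes, "ac", cc)
bloc_ab, bloc_ac = score(votes, "ab", bloc), score(votes, "ac", bloc)
```

Here `cc_ab` is 3 while `cc_ac` is 5, whereas `bloc_ab` is 6 and `bloc_ac` is 5.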

Criterion for fixed-majority consistency
In this section, we provide a formal characterization of those top-k-counting rules that satisfy the fixed-majority criterion. Together with Theorem 4, this gives an almost full characterization of committee scoring rules with this property.
Theorem 5 Let $f = (f_{m,k})_{2k \le m}$ be a family of committee scoring functions with the corresponding family $(g_{m,k})_{2k \le m}$ of counting functions. Then, $R_f$ satisfies the fixed-majority criterion if and only if for every $k, m \in \mathbb{N}$, $2k \le m$, it holds that: (i) $g_{m,k}$ is not constant, and (ii) for each pair of nonnegative integers $k_1, k_2$ with $k_1 + k_2 \le k$, we have that:

$$g_{m,k}(k) - g_{m,k}(k - k_2) \ge g_{m,k}(k_1 + k_2) - g_{m,k}(k_1).$$

(Condition (ii) in Theorem 5 is a relaxation of the convexity property for function $g_{m,k}$ and is illustrated in Fig. 1; we discuss this in more detail after the proof of the theorem.)

Proof of Theorem 5 Let $f_{m,k}$ be one of the committee scoring functions and $g_{m,k}$ be its corresponding counting function. By Proposition 2, $g_{m,k}$ is nondecreasing, so the fact that it is non-constant is equivalent to $g_{m,k}(k) > g_{m,k}(0)$. Moreover, we note that conditions (i) and (ii) imply that for each $k'$ with $0 \le k' \le k - 1$, we have $g_{m,k}(k) > g_{m,k}(k')$. To see this we take $k_2 = 1$ and note that for each $k_1$ with $0 \le k_1 \le k - 1$ it holds that $g_{m,k}(k) - g_{m,k}(k-1) \ge g_{m,k}(k_1 + 1) - g_{m,k}(k_1)$. Thus, if we had $g_{m,k}(k) = g_{m,k}(k-1)$, then $g_{m,k}$ would be constant, contradicting (i); hence $g_{m,k}(k) > g_{m,k}(k-1) \ge g_{m,k}(k')$.

Let us now show that if for each $m$ and $k$, $g_{m,k}$ satisfies (i) and (ii), then $R_f$ has the fixed-majority property. Let $E = (C, V)$ be an election with $n$ voters and $m$ candidates for which there is a size-$k$ committee $M$ such that a majority of the voters rank all members of $M$ in the top $k$ positions, but $M$ loses to some committee $S \ne M$ (also of size $k$). That is, we have $\mathrm{score}(S) \ge \mathrm{score}(M)$. Let $\xi$ be a rational number, $\frac{1}{2} < \xi \le 1$, such that exactly $\xi n$ voters rank all the members of $M$ in the top $k$ positions; we will refer to these voters as M-voters and to the others as non-M-voters.

Without loss of generality, we can assume that all the non-M-voters have identical preference orders. Indeed, if it were the case that for some two non-M-voters $v_i$ and $v_j$ the difference between the scores that $v_i$ assigns to $S$ and to $M$ were strictly larger than the corresponding difference for $v_j$, then we could replace the preference order of $v_j$ with that of $v_i$ and increase the advantage of $S$ over $M$. If for all non-M-voters this difference were the same, then we could simply pick the preference order of one of them and assign it to all the other ones.
Let k 1 , k 2 , k 3 , and k 4 be four numbers such that: 1. k 1 is the number of candidates from S ∩ M that the non-M-voters rank among their top k positions, 2. k 2 is the number of candidates from S\M that the non-M-voters rank among their top k positions, 3. k 3 is the number of candidates from C\(S ∪ M) that the non-M-voters rank among their top k positions, and 4. k 4 is the number of candidates from M\S that the non-M-voters rank among their top k positions.
Without loss of generality, we can assume that k 4 = 0 and that |S\M| = k 2 (since m ≥ 2k, we can replace all members of M\S with candidates from C\M, and, similarly, we can ensure that all members of S\M are ranked among the top k positions by non-M-voters; these changes never decrease the score of S relative to that of M).
In effect, we have that $k_1 + k_2 + k_3 = k$ and, since $|S \cap M| + |S \setminus M| = k$, we have that $|S \cap M| = k - k_2$. We can assume that $k_2 > 0$ as otherwise we would have $S = M$. Given this notation, the difference between the scores of $M$ and $S$ is:

$$\begin{aligned} \mathrm{score}(M) - \mathrm{score}(S) &= \xi n \bigl(g_{m,k}(k) - g_{m,k}(k - k_2)\bigr) + (1 - \xi) n \bigl(g_{m,k}(k_1) - g_{m,k}(k_1 + k_2)\bigr) \\ &= (1 - \xi) n \Bigl[\bigl(g_{m,k}(k) - g_{m,k}(k - k_2)\bigr) - \bigl(g_{m,k}(k_1 + k_2) - g_{m,k}(k_1)\bigr)\Bigr] + (2\xi - 1) n \bigl(g_{m,k}(k) - g_{m,k}(k - k_2)\bigr) \\ &> 0, \end{aligned}$$

where the second equality holds due to rearranging of terms, and the final inequality is an immediate consequence of the assumptions regarding the value of $\xi$ and the properties of $g_{m,k}$ (namely, that $g_{m,k}(k) - g_{m,k}(k - k_2) \ge g_{m,k}(k_1 + k_2) - g_{m,k}(k_1)$ and that $g_{m,k}(k) - g_{m,k}(k - k_2) > 0$). This, however, contradicts the assumption that $\mathrm{score}(S) \ge \mathrm{score}(M)$ and, so, $R_f$ satisfies the fixed-majority criterion.

We now consider the other direction. For the sake of contradiction, let us assume that $R_f$ satisfies the fixed-majority criterion but that there exist $m$ and $k$ such that it is not the case that conditions (i) and (ii) are both satisfied. If condition (i) is not satisfied and $g_{m,k}$ is a constant function, then $R_f$ fails the fixed-majority criterion because it always outputs all the subsets of size $k$, independently of the voters' preferences. Thus, suppose that $g_{m,k}$ is not constant but condition (ii) does not hold, so that there exist $k_1$ and $k_2$ with $k_1 + k_2 \le k$ such that $g_{m,k}(k) - g_{m,k}(k - k_2) < g_{m,k}(k_1 + k_2) - g_{m,k}(k_1)$. We form an election with $m$ candidates, $c_1, \ldots, c_m$, and $2n + 1$ voters (we describe the choice of $n$ later). The first $n + 1$ voters have preference order:

$$c_1 \succ c_2 \succ \cdots \succ c_m,$$

and the remaining $n$ voters have preference order:

$$c_1 \succ \cdots \succ c_{k_1} \succ c_{k+1} \succ \cdots \succ c_{2k-k_1} \succ c_{2k-k_1+1} \succ \cdots \succ c_m \succ c_{k_1+1} \succ \cdots \succ c_k.$$

Since $R_f$ satisfies the fixed-majority criterion, in this election it outputs the unique winning committee $M = \{c_1, \ldots, c_k\}$. However, consider committee $S$:

$$S = \{c_1, \ldots, c_{k-k_2}\} \cup \{c_{k+1}, \ldots, c_{k+k_2}\}.$$

Since $m \ge 2k$, the difference between the scores of $M$ and $S$ is:

$$\mathrm{score}(M) - \mathrm{score}(S) = \bigl(g_{m,k}(k) - g_{m,k}(k - k_2)\bigr) + n \Bigl[\bigl(g_{m,k}(k) - g_{m,k}(k - k_2)\bigr) - \bigl(g_{m,k}(k_1 + k_2) - g_{m,k}(k_1)\bigr)\Bigr].$$

Since $g_{m,k}(k) - g_{m,k}(k - k_2) < g_{m,k}(k_1 + k_2) - g_{m,k}(k_1)$, we observe that for large enough $n$ the difference $\mathrm{score}(M) - \mathrm{score}(S)$ becomes negative. This is a contradiction showing that (ii) holds.
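Conditions (i) and (ii) of Theorem 5 are easy to test mechanically for a single counting function. The sketch below assumes that the counting function is given as a list `g` with `g[x]` for x = 0, ..., k and that it is nondecreasing (as in the proof); the function name is a hypothetical helper of ours.

```python
def satisfies_theorem5_conditions(g):
    """Check conditions (i) and (ii) of Theorem 5 for a nondecreasing
    counting function given as a list g[0..k]."""
    k = len(g) - 1
    # (i) g is not constant (for nondecreasing g this means g[k] > g[0]).
    non_constant = g[k] > g[0]
    # (ii) g[k] - g[k - k2] >= g[k1 + k2] - g[k1] whenever k1 + k2 <= k.
    focused_convexity = all(
        g[k] - g[k - k2] >= g[k1 + k2] - g[k1]
        for k1 in range(k + 1)
        for k2 in range(k - k1 + 1)
    )
    return non_constant and focused_convexity
```

For k = 4, the counting functions of Bloc (`[0, 1, 2, 3, 4]`) and Perfectionist (`[0, 0, 0, 0, 1]`) pass the check, while that of α k -CC (`[0, 1, 1, 1, 1]`) fails it already for k 1 = 0 and k 2 = 1.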
Let us take a step back and consider what condition (ii) from Theorem 5 means (recall Fig. 1). Intuitively, it resembles the convexity condition, but 'focused' on g m,k (k) (see the explanation below).
Definition 6 Let $g_{m,k}$ be a counting function for some top-k-counting function $f_{m,k}\colon [m]_k \to \mathbb{R}_+$. We say that $g_{m,k}$ is convex if for each $k'$ such that $2 \le k' \le k$, it holds that:

$$g_{m,k}(k') - g_{m,k}(k' - 1) \ge g_{m,k}(k' - 1) - g_{m,k}(k' - 2).$$

On the other hand, we say that $g_{m,k}$ is concave if for each $k'$ with $2 \le k' \le k$ it holds that:

$$g_{m,k}(k') - g_{m,k}(k' - 1) \le g_{m,k}(k' - 1) - g_{m,k}(k' - 2).$$

Using inductive reasoning, we see that the above definition of a convex top-k-counting function is equivalent to requiring that for each $k'$, $k''$, and $d$ such that $k' \le k'' \le k$, $k' - d \ge 0$, and $k'' - d \ge 0$, it holds that:

$$g_{m,k}(k'') - g_{m,k}(k'' - d) \ge g_{m,k}(k') - g_{m,k}(k' - d).$$

Condition (ii) of Theorem 5 is of the same form, except that we fix $k''$ to be $k$ (i.e., we 'focus on $g_{m,k}(k)$'), set $d = k_2$, and set $k' = k_1 + k_2$.
The notions of convexity and concavity are standard, but allow us to express many features of top-k-counting rules in a very intuitive way. For example, the following corollary is an immediate consequence of Theorem 5.
Corollary 6 Let f = ( f m,k ) 2k≤m be a family of top-k-counting committee scoring functions with the corresponding family (g m,k ) 2k≤m of counting functions. The following statements hold: 1. if g m,k are convex, then R f satisfies the fixed-majority criterion, and 2. if g m,k are concave but not linear (that is, R f is not Bloc) then R f fails the fixed-majority criterion.
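Convexity and concavity of a counting function can be read off from its successive differences. The following sketch, with hypothetical helper names and with counting functions represented as lists `g[0..k]`, classifies a counting function in this way.

```python
def differences(g):
    """Successive differences g[t] - g[t-1] for t = 1..k."""
    return [g[t] - g[t - 1] for t in range(1, len(g))]


def is_convex(g):
    """True if the differences of g are nondecreasing."""
    d = differences(g)
    return all(d[i] >= d[i - 1] for i in range(1, len(d)))


def is_concave(g):
    """True if the differences of g are nonincreasing."""
    d = differences(g)
    return all(d[i] <= d[i - 1] for i in range(1, len(d)))
```

With k = 3, the list `[0, 1, 2, 3]` (Bloc) is both convex and concave, `[0, 0, 0, 1]` (Perfectionist) is convex only, and `[0, 1, 1, 1]` (α k -CC) is concave only.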
The counting function for the Bloc rule is linear (and, thus, both convex and concave), and the counting function for the Perfectionist rule is convex, so these two rules satisfy the fixed-majority criterion. On the other hand, the counting function for α k -CC is concave and, so, this rule fails the criterion (as we observed in Example 2). (It may be helpful to remark here that committee scoring rules are uniquely represented by their committee scoring functions, up to affine transformations; this result is provided in the technical report version of the work of .) By Proposition 3, a family of concave counting functions g m,k corresponds to a nonincreasing OWA operator and a family of convex counting functions corresponds to a nondecreasing one.  provided evidence that rules based on nonincreasing OWA operators are computationally easier than those based on general OWA operators (while computing the exact winning committees tends to be computationally hard in both cases, there are, for example, polynomial-time constant-factor approximation algorithms whenever the operators are nonincreasing; unless P = NP, such algorithms do not exist for many rules based on the other OWA operators). In Sect. 4 we show that this seems to be the case for top-k-counting rules as well, but we also provide a striking example highlighting a certain dissimilarity.

Characterization of Bloc within committee scoring rules
We conclude this section by noting that Theorems 4 and 5, together with a result of Elkind et al., suffice to characterize Bloc within the class of committee scoring rules. To present this result, we need the following definition of Elkind et al.:

Definition 7 A multiwinner rule $R$ is noncrossing monotone if the following holds: Whenever committee $W$ of size $k$ is winning in some election $E$, then $W$ is also winning in every election $E'$ resulting from shifting some member $c$ of $W$ one position forward in some vote (provided that $c$ does not pass any other member of $W$).

Elkind et al. have shown that a committee scoring rule is noncrossing monotone if and only if it is weakly separable, that is, if and only if its scoring functions $f = (f_{m,k})_{k \le m}$ are of the form:

$$f_{m,k}(i_1, \ldots, i_k) = \sum_{t=1}^{k} \gamma_{m,k}(i_t), \tag{1}$$

where $\gamma = (\gamma_{m,k})_{k \le m}$ is a family of single-winner scoring functions. Since the scoring functions of the Bloc rule are the only top-k-counting scoring functions of this form [this also follows by uniqueness of representation of committee scoring rules], by Theorems 4 and 5 we get the following corollary.
Corollary 7 Bloc is the only committee scoring rule that is both fixed-majority consistent and noncrossing monotone.
This corollary calls for two comments. First, the reader may complain that Theorems 4 and 5 assume that the number of candidates is at least twice as large as the committee size, but in Corollary 7 we do not make this assumption. Indeed, Theorems 4 and 5 suffice for Corollary 7 only for the case where $2k \le m$. For the case where $2k > m$, one can show that the result still holds by using the fact that noncrossing monotonicity guarantees that our committee scoring rule has scoring functions of the form (1). Indeed, it suffices to consider an election with: $n + 1$ votes of the form $c_1 \succ c_2 \succ \cdots \succ c_k \succ c_{k+1} \succ \cdots \succ c_m$, and $n$ votes of the form $c_{k+1} \succ c_1 \succ \cdots \succ c_{k-1} \succ c_{k+2} \succ \cdots \succ c_m \succ c_k$.
If it were the case that $\gamma_{m,k}(1) > \gamma_{m,k}(k)$ then, for sufficiently large $n$, candidate $c_{k+1}$ would have a higher $\gamma_{m,k}$-score than $c_k$ in the above election and, in consequence, the committee $\{c_1, \ldots, c_k\}$ would not be winning. Thus our rule would not be fixed-majority consistent. The same would hold if we had that $\gamma_{m,k}(1) = \cdots = \gamma_{m,k}(k)$, but $\gamma_{m,k}(k + 1) > \gamma_{m,k}(m)$: For sufficiently large $n$, $c_{k+1}$ would have a higher $\gamma_{m,k}$-score than $c_k$. Naturally, if $\gamma_{m,k}(1) = \cdots = \gamma_{m,k}(m)$, then all committees would always win and the rule would not be fixed-majority consistent either. Thus the only functions $\gamma_{m,k}$ remaining are such that $\gamma_{m,k}(1) = \cdots = \gamma_{m,k}(k) > \gamma_{m,k}(k + 1) = \cdots = \gamma_{m,k}(m)$. Such functions generate exactly the Bloc rule, which is fixed-majority consistent. Second, Bloc has also been characterized as the only committee scoring rule that is top-k-counting and weakly separable, which is the same result as ours, but phrased in terms of syntactic properties of scoring functions and not in terms of axiomatic properties.
Throughout this section, we focus on committee scoring functions of the form $f_{m,k}\colon [m]_k \to \mathbb{N}$, that is, on functions that always return nonnegative integers as scores. This is a technical assumption, motivated by the fact that representing arbitrary real numbers on a computer can be problematic. To avoid confusion, we mention this assumption explicitly in each relevant theorem.
Remark 4 For a committee scoring rule R f , when we say that this rule is NP-hard to compute, we formally mean that, given an election E = (C, V ), a committee size k, and a nonnegative integer T , the problem of deciding if there exists a committee S of size k whose score is at least T is NP-hard. Indeed, if we were able to compute an R f winning committee of size k in polynomial time, then we could solve this decision problem in polynomial time as well, by checking if the score of the winning committee is at least T (provided that f were polynomial-time computable). Conversely, if we knew that our decision problem were NP-hard, then we would also know that the ability to compute winning committees under R f implies the ability to solve NP-hard problems.
We start by considering several examples. It is well-known that Bloc winners can be computed in polynomial time; this is so since one can compute the score of each candidate separately. It turns out that the same holds for the Perfectionist rule, albeit following different reasoning.

Proposition 8 Both Bloc and Perfectionist winners are computable in polynomial time.
Proof The case of Bloc is well-known (to form a winning committee of size k it suffices to pick k candidates with the highest k-Approval scores). To find a size-k winning committee under the Perfectionist rule, for each voter v we consider the set of his or her top-k candidates as a committee, and compute the score of that committee in the election. We output those committees, among the considered ones, that have the highest score. Correctness follows by noting that the committees considered by the algorithm are the only ones with nonzero scores.
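The procedure for the Perfectionist rule described in the proof can be sketched as follows. We assume that votes are lists ordered from the most to the least preferred candidate, that there is at least one voter, and that a voter gives one point to a committee exactly when it coincides with his or her top-k set; the function name is ours.

```python
from collections import Counter


def perfectionist_winners(votes, k):
    """Return (winning committees as sorted tuples, their common score).

    Only committees equal to some voter's top-k set can have a nonzero
    score, so it suffices to count how often each such set occurs.
    """
    counts = Counter(tuple(sorted(v[:k])) for v in votes)
    best = max(counts.values())
    return sorted(W for W, n in counts.items() if n == best), best
```

The running time is linear in the number of voters (up to the cost of sorting k candidates per vote), in contrast to the exponential number of possible committees.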
While the above result is very simple, it is also very interesting. For example, Perfectionist is the first example of a polynomial-time computable committee scoring rule that is not weakly separable. Further, it stands in sharp contrast to the results of Skowron et al. By Proposition 3, Perfectionist is defined through the OWA operator (0, . . . , 0, 1), and Skowron et al. have shown that, in general, rules defined through this operator are NP-hard to compute and very difficult to approximate. Their result, however, relies on the fact that voters can approve any number of candidates, while in our case they must approve exactly k of them. This shows very clearly that even though top-k-counting rules are OWA-based, we cannot simply carry over the computational hardness results of Skowron et al. or Aziz et al. (2015) to our framework.
We can generalize Proposition 8 to rules that are, in some sense, similar to Perfectionist. To this end, and to facilitate our later discussion regarding the complexity of top-k-counting rules, we define the following property of counting functions.
Definition 8 Let $g_{m,k}$ be a counting function for a top-k-counting function $f_{m,k}\colon [m]_k \to \mathbb{N}$. We define the singularity of $g_{m,k}$, denoted by $\mathrm{sing}(g_{m,k})$, to be

$$\mathrm{sing}(g_{m,k}) = \min\bigl\{t \in \{2, \ldots, k\} : g_{m,k}(t) - g_{m,k}(t-1) \ne g_{m,k}(t-1) - g_{m,k}(t-2)\bigr\}.$$

Loosely speaking, $\mathrm{sing}(g_{m,k})$ is the smallest integer in $\{2, \ldots, k\}$ for which the differential of $g_{m,k}$ changes. For Bloc (which is an exception) we define $\mathrm{sing}(g_{m,k})$ to be $\infty$, since the differential is a constant function. Naturally, for all other non-constant rules, the singularity is finite. For example, for Perfectionist we have $\mathrm{sing}(g_{m,k}) = k$.
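As an illustration, the singularity can be computed by a single scan over the differences of the counting function. This sketch assumes the reading of Definition 8 suggested by the text (the smallest t in {2, ..., k} at which the difference g(t) − g(t−1) differs from the previous one) and represents a counting function as a list `g[0..k]`.

```python
import math


def singularity(g):
    """Smallest t in {2, ..., k} where the difference of g changes;
    math.inf if all differences are equal (as for Bloc)."""
    k = len(g) - 1
    for t in range(2, k + 1):
        if g[t] - g[t - 1] != g[t - 1] - g[t - 2]:
            return t
    return math.inf
```

For k = 4, this gives 4 for Perfectionist (`[0, 0, 0, 0, 1]`), infinity for Bloc (`[0, 1, 2, 3, 4]`), and 2 for α k -CC (`[0, 1, 1, 1, 1]`).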
We generalize the polynomial-time algorithm for Perfectionist to similar rules, for which the value $\mathrm{sing}(g_{m,k})$ is close to $k$.

Proof of Proposition 9 Let the input consist of election $E = (C, V)$ and positive integer $k$, and let $W$ be a winning committee in $R_f(E, k)$. We assume that $q < \frac{k}{2}$ (if it were not the case, then $k \le 2q$ would be small and we could solve the problem using brute-force). We consider two cases: (1) there is at least one voter that has at least $\mathrm{sing}(g_{m,k})$ of his or her top $k$ candidates in $W$; (2) every voter has less than $\mathrm{sing}(g_{m,k})$ of his or her top $k$ candidates in $W$.

If case (1) holds, then we can compute $W$ (or some other winning committee) by checking, for each voter $v$, all the committees that contain at least $\mathrm{sing}(g_{m,k})$ candidates that $v$ ranks among his or her top $k$ positions. Since $k - \mathrm{sing}(g_{m,k}) \le q$, the number of committees that we have to check for each voter is at most:

$$\sum_{t=\mathrm{sing}(g_{m,k})}^{k} \binom{k}{t} \binom{m}{k-t} \le (q+1) \binom{k}{k - \mathrm{sing}(g_{m,k})} \binom{m}{k - \mathrm{sing}(g_{m,k})} \le (q+1)\, k^{q} m^{q},$$

which is a polynomial in $k$ and $m$. The above inequality requires some care: We have that $\mathrm{sing}(g_{m,k}) > \frac{k}{2}$ (because $k - \mathrm{sing}(g_{m,k}) \le q < \frac{k}{2}$) and, in effect, we have that for each $t \in \{\mathrm{sing}(g_{m,k}), \ldots, k\}$ it holds that $\binom{k}{t} = \binom{k}{k-t} \le \binom{k}{k - \mathrm{sing}(g_{m,k})}$ and $\binom{m}{k-t} \le \binom{m}{k - \mathrm{sing}(g_{m,k})}$. If case (2) holds, then from the fact that $g_{m,k}(x) - g_{m,k}(x - 1)$ is a constant for $1 \le x < \mathrm{sing}(g_{m,k})$, we infer that $g_{m,k}(x)$ is effectively linear on the relevant range. Then, it suffices to compute the winning committee using the Bloc rule. While we do not know which of the two cases holds, we can compute the two committees, one as in case (1) and one as in case (2), and output the one with the higher score (or either of them, in case of a tie).
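The two-case procedure from the proof can be sketched as follows. This is an illustration under our own assumptions rather than a definitive implementation: the counting function is a list `g[0..k]` that we assume to be convex (so that falling back to the Bloc committee in case (2) is safe), votes are lists ordered from the most to the least preferred candidate, and all function names are hypothetical.

```python
import math
from itertools import combinations


def score(votes, committee, g):
    """Total top-k-counting score of a committee."""
    k = len(committee)
    return sum(g[len(set(v[:k]) & set(committee))] for v in votes)


def singularity(g):
    """Smallest t >= 2 where the difference of g changes (inf if none)."""
    for t in range(2, len(g)):
        if g[t] - g[t - 1] != g[t - 1] - g[t - 2]:
            return t
    return math.inf


def two_case_winner(votes, k, g):
    """Best committee among the Bloc committee (case 2) and all committees
    sharing at least sing(g) members with some voter's top k (case 1)."""
    candidates = list(votes[0])
    s = singularity(g)
    # Case (2): k candidates with the highest k-Approval scores.
    approvals = {c: sum(c in v[:k] for v in votes) for c in candidates}
    pool = {frozenset(sorted(candidates, key=lambda c: -approvals[c])[:k])}
    # Case (1): t >= s members from a voter's top k, the rest from below.
    if s <= k:
        for v in votes:
            for t in range(s, k + 1):
                for inside in combinations(v[:k], t):
                    for outside in combinations(v[k:], k - t):
                        pool.add(frozenset(inside + outside))
    return max(pool, key=lambda W: score(votes, W, g))
```

When k − sing(g) is bounded by a constant, the innermost loops enumerate only polynomially many committees per voter, which mirrors the counting argument in the proof.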

Example 3 Consider the following committee scoring function:

$$f_{m,k}(i_1, \ldots, i_k) = \sum_{t=1}^{k} \alpha_k(i_t) + \prod_{t=1}^{k} \alpha_k(i_t).$$

As a simple application of Proposition 9, we get that the committee scoring rule $R_f$ defined through $f$ is polynomial-time computable. This rule can be seen as a variant of Bloc, where a voter gives one additional bonus point to a committee if he or she approves of all its members. By Corollary 6, this rule is fixed-majority consistent. It is also interesting to consider the rule which is defined through the following committee scoring function:

$$f'_{m,k}(i_1, \ldots, i_k) = \sum_{t=1}^{k} \alpha_1(i_t) + \prod_{t=1}^{k} \alpha_k(i_t).$$

The corresponding rule is also polynomial-time computable (it suffices to compute an SNTV winning committee, and compare it with those committees all of whose members are ranked among the first k positions by some voter), but it is not a top-k-counting rule and, so, it fails the fixed-majority criterion.
Yet, as one might expect, not all top-k-counting rules are polynomial-time solvable and, indeed, most of them are not (under standard complexity-theoretic assumptions). For example, α k -CC is NP-hard (this follows quite easily from Theorem 1 of Procaccia et al. (2008); we include a brief proof to substantiate the discussion and give the reader some intuition).
Proposition 10 For α k -CC it is NP-hard to decide whether or not there exists a committee with at least a given score (recall that k in α k -CC is the committee size and, thus, is part of the input).
Proof sketch The NP-hardness follows easily from a standard reduction from the Exact Cover by 3-Sets problem, abbreviated as X3C. In an instance of X3C we are given a family of m subsets, S 1 , . . . , S m , each of cardinality 3, chosen from a given universal set U = {x 1 , . . . , x 3n }, and we ask if there are n subsets from the family whose union is U . Additionally, we assume that each element of U belongs to at most three subsets [it is well-known that this variant of X3C remains NP-complete (Garey and Johnson 1979)].
Given an instance of X3C, we create a candidate for each subset and a voter for each element of U. Each voter ranks the candidates corresponding to the subsets that contain his or her element in the top positions, then ranks some n dummy candidates (different ones for each voter), and then all the remaining candidates (in some arbitrary, easy to compute, order). We ask for a committee of size k = n (and we assume that n ≥ 3; this is a technical assumption as for n = 1 and n = 2 our construction is formally incorrect). There is a winning committee with score 3n if and only if the answer for the input instance is "yes."

We generalize the above NP-hardness result to the case of convex top-k-counting rules $R_f$ for which there is some constant $c$ such that for each $k$ and $m$ it holds that $k - \mathrm{sing}(g_{m,k}) \ge k/c$ (that is, to the case of convex counting functions for which the differential changes 'early'). The proof of this result is fairly technical and is available in Appendix A.

Theorem 11 Let $R_f$ be a top-k-counting rule defined through a family $f$ of top-k-counting functions $f_{m,k}\colon [m]_k \to \mathbb{N}$ with the corresponding family of counting functions $(g_{m,k})_{k \le m}$ that do not depend on $m$, $g_{m,k} = g_k$, and such that:

1. $g_k$ is computable in polynomial time with respect to $k$ (that is, there is a polynomial-time algorithm that given $x$ and $k$ outputs $g_k(x)$). Moreover, for each $k$, $g_k(k)$ is polynomially bounded in $k$.

2. There is a constant $c$ such that, for each size of committee $k$ greater than some fixed constant $k_0$, $g_k$ is convex and $k - \mathrm{sing}(g_k) \ge k/c$.

Then, deciding if there is a committee with at least a given score is NP-hard for $R_f$.
Let us now discuss the assumptions of the theorem, where they come from and why we believe they are natural (or necessary).
First, the assumption that the counting functions are computable in polynomial time is standard and clear. Indeed, it would not be particularly interesting to seek hardness results if already the counting functions were hard to compute.
Second, we believe that the assumption that the counting functions g m,k do not depend on m is reasonable. For example, it is quite intuitive that adding some candidates that all the voters rank last should not have any effect on the committee selected by a top-k-counting rule. (The assumption is also very helpful on the technical level. Our construction uses a number of dummy candidates that depends on the values of the counting function. If the values of the counting function depended on the number of candidates, we might end up with a very problematic, circular dependence.) Third, the assumption that there is a constant c such that for any large enough committee size k we have k − sing(g k ) ≥ k/c says that the function "shows its convex behavior" early enough. As shown in Proposition 9, some assumption of this form is necessary (though there is still a gap, since the bounds from the theorem and from Proposition 9 do not match perfectly), and it is the core of the theorem.
Finally, perhaps the least intuitive assumption in this theorem is the requirement that for a given committee size k, the highest value of the counting function is polynomially bounded in k. The reason for having it is that, if the highest value were extremely large (say, exponentially large with respect to k) then, for sufficiently few voters (for example, polynomially many), the rule might degenerate to a polynomial-time computable one (for example, it might resemble the Perfectionist rule for this case). Exactly to avoid such problems, in our proof we use a number of voters that depends on g k (k). Our reduction would not run in polynomial time if g k (k) were superpolynomial.
A result similar to Theorem 11, but for concave rules, is possible as well [and, in essence, follows from the proofs of  and Aziz et al. (2015)]. Thus, in general, top-k-counting functions tend to be NP-hard to compute. What can we do if we need to use them anyway? There are several possibilities. Next we consider approximability and fixed-parameter tractability as possible approaches.

Approximability
First, for concave top-k-counting rules we can obtain a constant-factor approximation algorithm [we deduce it from the result of Skowron et al. (2016), which, in essence, boils down to optimizing a submodular function using the seminal results of Nemhauser et al. (1978)]. In particular, the next result applies to the α k -PAV rule (that is, to the top-k-counting rule based on the OWA operators of the form $(1, \frac{1}{2}, \frac{1}{3}, \ldots, \frac{1}{k})$; recall its discussion from Sect. 2).

Theorem 12 Let $R_f$ be a top-k-counting rule defined through a family $f$ of (polynomial-time computable) top-k-counting functions $f_{m,k}\colon [m]_k \to \mathbb{N}$ with corresponding counting functions $g_{m,k}$ that are concave. Then there is a polynomial-time algorithm that, given an election $E$ and a committee size $k$, computes a committee $W$ of size $k$, whose score, under $R_f$, is at least a $(1 - \frac{1}{e})$ fraction of the score of the winning committee(s) from $R_f(E, k)$.
Proof This follows from the fact that concave top-k-counting rules correspond to OWA-based rules that use nonincreasing OWA operators. For such rules, there is a $(1 - \frac{1}{e})$-approximation algorithm for computing the score of the winning committees and for computing a committee with such a score (Skowron et al. 2016, Theorem 4).
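To give some intuition for where the 1 − 1/e guarantee comes from, here is a sketch of the standard greedy procedure for maximizing a monotone submodular function in the sense of Nemhauser et al. (1978), applied to a concave counting function with g(0) = 0. It is an illustration under our own assumptions and not necessarily the exact algorithm of Skowron et al. (2016); the counting function is a list `g[0..k]` and the function names are hypothetical.

```python
def score(votes, committee, g, k):
    """Score of a (possibly partial) committee; k is the target size, so
    each voter always counts members among his or her top k positions."""
    return sum(g[len(set(v[:k]) & set(committee))] for v in votes)


def greedy_committee(votes, k, g):
    """Add, k times, the candidate that yields the highest score."""
    committee = []
    for _ in range(k):
        remaining = [c for c in votes[0] if c not in committee]
        committee.append(
            max(remaining, key=lambda c: score(votes, committee + [c], g, k))
        )
    return committee
```

On the small election with three votes a > b > c > d and two votes c > d > a > b, with k = 2 and the counting function `[0, 1, 1]` of α k -CC, the greedy procedure first picks a and then c, reaching the optimal score of 5.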
Such a general result for convex counting functions seems impossible. Let us consider a convex counting function g m,k (x) = max(x − 1, 0) that is nearly identical to the linear counting function used by Bloc. Let us refer to the top-k-counting rule defined by (g m,k ) k≤m as NearlyBloc. If we had a polynomial-time constant-factor approximation algorithm for NearlyBloc, we would have a constant-factor approximation algorithm for the Densest at most K Subgraph problem (abbreviated as DamkS; see below). Taking into account the results of Khuller and Saha (2009), Raghavendra and Steurer (2010), and Alon et al. (2011), this seems very unlikely.
Given a graph G, we refer to its sets of vertices and edges as V (G) and E(G), respectively. The density of a graph G is defined as $\delta(G) = \frac{|E(G)|}{|V(G)|}$.

Definition 9 In the Densest at most K Subgraph problem, DamkS, we are given a graph G and we ask for a subgraph of G of the highest possible density with at most K vertices.
The proof of the next theorem is available in Appendix B.

Theorem 13 There is no polynomial-time constant-factor approximation algorithm for the problem of computing the score of a winning committee under NearlyBloc, unless such an algorithm exists for the DamkS problem.
Nonetheless, for top-k-counting rules that are not too far from α k -CC, we have a polynomial-time approximation scheme (PTAS), that is, an algorithm that can achieve any desired approximation ratio, as long as the number of candidates is not too large relative to the committee size. This result holds even for rules that are not concave (provided they satisfy the conditions of the theorem); the result follows by noting that our voters have non-finicky utilities.

Proof of Theorem 14 We use the concept of non-finicky utilities. Adapting this terminology, we say that a single-winner scoring function $\gamma_m\colon [m] \to \mathbb{N}$ (for elections with $m$ candidates) is $(\xi, \delta)$-non-finicky for $\xi, \delta \in [0, 1]$, if each of the highest $\delta m$ numbers in the sequence $\gamma_m(1), \ldots, \gamma_m(m)$ is greater than or equal to $\xi \gamma_m(1)$. It is easy to see that $\alpha_k$ is $(1, \frac{k}{m})$-non-finicky. Consider an input election $E = (C, V)$ with $m$ candidates, and committee size $k$, such that $m = o(k^2)$. By Proposition 3, we know that $f_{m,k}$ is OWA-based, that it uses some OWA operator $\Lambda_{m,k}$ that has nonzero entries on the top positions only, and that it uses scoring function $\alpha_k$ (which is $(1, \frac{k}{m})$-non-finicky). Thus, due to , there is a polynomial-time 1 − exp − k 2 m 2 -approximation algorithm for computing the score of a winning committee under f . Using the assumption that $m = o(k^2)$, the approximation ratio of the algorithm is: This completes the proof.
Theorem 14 is quite remarkable even for the case of α k -CC (let alone that it applies to a somewhat more general set of rules). Indeed, generally, variants of the Chamberlin-Courant rule that use some sort of approval scoring function are hard to compute (Procaccia et al. 2008; Betzler et al. 2013) and the best possible approximation ratio for a polynomial-time algorithm, in the general case, is $1 - \frac{1}{e}$ [this result was observed by Skowron and Faliszewski (2017) and follows from results for the MaxCover problem (Feige 1998)]. This upper bound, however, relies on the fact that there is no connection between the size of the input election, the committee size, and the number of candidates that each voter approves. We obtain a PTAS because we assume that for the committee size $k$ each voter approves of $k$ candidates, and that the number $m$ of candidates is such that $m = o(k^2)$.
One may ask how likely it is that this last assumption holds. As a piece of anecdotal evidence, we mention that in the 2015 parliamentary elections in Poland, there were k = 460 seats in the parliament and m ≈ 8000 candidates. In this case, m/k 2 ≈ 0.0378, which suggests that our algorithm could be effective (provided that the voters could say which k candidates they approve of; likely, this would require some sort of simplified ballots, for example, allowing one to approve blocks of candidates).

Fixed-parameter tractability
If one were not interested in approximation algorithms but still wanted to use top-k-counting rules, then one might seek fixed-parameter tractable algorithms. In parameterized complexity we concentrate on some distinguished parameter of the problem, such as the number of candidates or the number of voters. We say that a parameterized problem is fixed-parameter tractable (is in FPT) if there is an algorithm that, given an instance of this problem of size $n$ with parameter $t$, computes an answer for the problem in time $f(t) \cdot n^{O(1)}$, where $f$ is some computable function (such an algorithm is also said to run in FPT time with respect to parameter $t$). For a detailed description of parameterized complexity, we point the readers to the books by Downey and Fellows (1999), Niedermeier (2006), and Cygan et al. (2015).
We start with a simple observation, namely that a winning committee can be computed for every top-k-counting rule in FPT time for the parameterization by the number of candidates.
Proposition 15 Let R f be a top-k-counting committee scoring rule, where the family f = ( f m,k ) k≤m ( f m,k : [m] k → N) is defined through a family of counting functions (g m,k ) k≤m (that are computable in FPT time with respect to m). There is an algorithm that, given a committee size k and an election E, computes a winning committee from R f (E, k) in FPT time with respect to the number m of candidates.
Proof The algorithm simply computes the score of every possible committee and outputs the one with the highest score. With $m$ candidates and committee size $k$, the algorithm has to check $\binom{m}{k} = O(m^m)$ committees, and checking each committee requires FPT time only.
For rules based on concave counting functions we can also provide a far less trivial FPT algorithm for the parameterization by the number of voters (the proof, which uses a somewhat technical trick on top of solving a mixed integer linear program, is available in Appendix C). The algorithm applies, for example, to the α k -PAV rule, which uses OWA operators of the form $(1, \frac{1}{2}, \frac{1}{3}, \ldots, \frac{1}{k})$, so its counting functions are of the form $g_{m,k}(t) = \sum_{i=1}^{t} \frac{1}{i}$.

Theorem 16 Let $R_f$ be a top-k-counting committee scoring rule, where the family $f = (f_{m,k})_{k \le m}$ ($f_{m,k}\colon [m]_k \to \mathbb{N}$) is defined through a family of concave counting functions $(g_{m,k})_{k \le m}$ (that are polynomial-time computable). There is an algorithm that, given a committee size $k$ and an election $E$, computes a winning committee from $R_f(E, k)$ in FPT time with respect to the number $n$ of voters.
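The exhaustive search from the proof of Proposition 15 is straightforward to write down. The sketch below assumes that votes are lists ordered from the most to the least preferred candidate and that the counting function is a list `g[0..k]`; the function name is ours.

```python
from itertools import combinations


def brute_force_winners(votes, k, g):
    """Return (all winning committees as sorted tuples, their score) by
    scoring every size-k subset of the candidates."""
    candidates = sorted(votes[0])
    scores = {
        W: sum(g[len(set(v[:k]) & set(W))] for v in votes)
        for W in combinations(candidates, k)
    }
    best = max(scores.values())
    return sorted(W for W, s in scores.items() if s == best), best
```

The number of committees examined depends on m and k only, while the work per committee is polynomial in the size of the election, which is what makes the running time FPT with respect to the number of candidates.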
To summarize, it appears that most (but certainly not all) top-k-counting rules are NP-hard to compute. For top-k-counting rules based on concave counting functions, there are good polynomial-time approximation algorithms and some exact FPT algorithms. On the other hand, for rules based on convex functions the situation is much more difficult. Aside from several algorithms that do not depend on concavity or convexity of the counting function (for instance the algorithms from Theorem 14 and Proposition 15), so far we only have evidence for computational hardness.

Related literature
The rules considered in this paper form a subfamily of the OWA-based rules of . A specific subclass of OWA-based rules (for the case where voters express their preferences in the form of approval ballots) was already mentioned in the early work of Thiele (1895). More recently, Aziz et al. (2017), Brill et al. (2017), and Sánchez-Fernández et al. (2017) analyzed selected axiomatic properties of the Thiele methods, and Aziz et al. (2015) studied their computational complexity. For a more general overview of approval-based multiwinner rules we refer the reader to the book by Kilgour (2010). It is also worth noting that there exist other OWA-based approaches to multiwinner voting [see, e.g., the work of Elkind and Ismaili (2015)], which, however, do not directly apply to our setting.
More generally, the class of OWA-based rules is a subclass of the class of committee scoring rules . Committee scoring rules have been recently axiomatically characterized by  classified voting rules within this class in the form of a hierarchy. The studies of axiomatic properties of other committee scoring rules also include the work of Debord (1992), who characterized the k-Borda voting rule. There is also a substantial literature describing axiomatic properties of other types of multiwinner rules; for an overview of this literature we refer the reader to the work of  and to the survey of .
Establishing the complexity of winner determination under various multiwinner rules is an active area of research. These studies were pioneered by Procaccia et al. (2008), who proved that computing winners under the Chamberlin-Courant committee scoring rule is NP-hard and, in consequence, motivated many researchers to seek ways of circumventing this result. For example, Betzler et al. (2013) have shown that the rule is polynomial-time computable for the case of single-peaked preferences and Yu et al. (2013) have done the same for single-crossing ones [Elkind and Lackner (2015), Skowron et al. (2015b), Peters (2018), and Peters and Lackner (2017) provided further generalizations and improvements to these results]. Betzler et al. (2013) studied the problem from the perspective of parameterized complexity theory, whereas Lu and Boutilier (2011) analyzed the possibility of approximation and proved that a simple greedy procedure guarantees the approximation ratio of 1 − 1/e (the ratio relates the scores of the winning committee and the one provided by the algorithm). Later, Skowron et al. (2015a) improved this result by showing a polynomial-time approximation scheme. Oren and Lucier (2014) proved that if the voters arrive in a random order then the greedy algorithm can be easily adapted to the online setting, preserving the approximation ratio arbitrarily close to 1 − 1/e; they also observed that for certain specific distributions of votes this approximation ratio can actually improve. Skowron and Faliszewski (2017) studied FPT approximation algorithms for the approval-based Chamberlin-Courant rule and Faliszewski et al. (2016) showed that often in practice the quality of approximation can be improved by employing certain clustering algorithms.
So far, analysis of the complexity of other committee scoring rules has received far less attention, but this seems to be changing quickly. For example, it was shown that finding winners according to the proportional approval voting rule (the PAV rule), another approval-based committee scoring rule, is NP-hard (Aziz et al. 2015), but there exist good approximation algorithms for the problem (Skowron 2016; Byrka et al. 2017). The complexity of other selected subclasses of committee scoring rules has been studied by  and by . There also exists a literature studying the computational complexity of other multiwinner rules, which do not belong to the class of committee scoring rules, such as Minimax Approval Voting (MAV): finding winners under MAV is NP-hard (LeGrand 2004), yet there exists a PTAS for the problem (Byrka and Sornat 2014). Parameterized complexity and parameterized approximations of the rule were considered by Misra et al. (2015) and Cygan et al. (2017). The computational complexity of these and other important issues pertaining to MAV was considered by Baumeister et al. (2010, 2016).

Our work regards the model of multiwinner elections where the voters rank the candidates and it is the voting rule's task to (implicitly) derive rankings of the committees (in a systematic way, according to the principles that underlie the given rule). Another approach, pioneered by Fishburn (1981a, b), is to require that the voters rank the committees explicitly. This approach is useful when there are dependencies between the candidates that are hard (or impossible) to capture within simple preference orders (e.g., when it is important that all members of an elected committee can work together), but can be used directly only in very limited settings (for example, there are 252 committees of five out of ten candidates; it would be unreasonable to ask voters to rank them all). In other cases, one has to rely on concise means of expressing voters' preferences, such as the formalism of CP-nets (Boutilier et al. 2004). Multiwinner elections of this type are often studied within the area of voting in combinatorial domains (Lang and Xia 2015).

Conclusions and further research
Aiming at finding a multiwinner analogue of the single-winner Plurality rule, we have shown that the answer is quite involved. While it is tempting to view SNTV as a natural analogue of Plurality, a closer look reveals that it fails the fixed-majority criterion (which Plurality satisfies in the single-winner setting). We have found that, among all committee scoring rules, only the top-k-counting rules (a class of rules we have defined in this paper) have a chance of satisfying the fixed-majority criterion, and we have characterized when this happens. Specifically, we have shown that the committee scoring rules which satisfy the fixed-majority criterion are exactly those top-k-counting rules whose counting functions satisfy a relaxed variant of convexity.
For example, the Bloc and Perfectionist rules both satisfy the fixed-majority criterion and, so, in some sense, they are among the multiwinner analogues of Plurality (for the Perfectionist rule this goes quite deep). On the other hand, a variant of the Chamberlin-Courant rule based on the k-Approval scoring function is top-k-counting, but fails the fixed-majority criterion.
We believe that it is very interesting to focus on top-k-counting rules based either on convex or on concave counting functions. These two classes of rules are different in some interesting ways: top-k-counting rules based on convex counting functions are fixed-majority consistent, but seem very hard to compute (with a few exceptions); this stands in contrast to top-k-counting rules based on concave counting functions, which fail the fixed-majority criterion (the borderline case of the Bloc rule excluded), but are much easier to compute (typically still NP-hard, but with constant-factor polynomial-time approximation algorithms and FPT algorithms for the parameterization by the number of voters).
Our work leads to a number of open questions. In the axiomatic direction, it would be interesting to consider notions analogous to the fixed-majority criterion for the setting where voters do not provide preference orders but, instead, simply indicate which candidates they do or do not approve. On the computational front, it would be interesting to find more powerful algorithms for computing winning committees under various top-k-counting rules (e.g., for the $\alpha_k$-PAV rule).

A Proof of Theorem 11
We prove NP-hardness of the problem by giving a reduction from the Clique problem on regular graphs (a graph is regular if all its vertices have the same degree). In the Clique problem we are given a graph G and an integer h, and we ask if there exists a set of h pairwise adjacent vertices in G (such a set of vertices is referred to as a size-h clique). The problem remains NP-complete when restricted to regular graphs (Garey and Johnson 1979).
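To make the source problem of the reduction concrete, the sketch below checks regularity of a graph and searches for a size-h clique by exhaustive enumeration. It is an illustration only: the graph encoding (a vertex list plus a list of undirected edges without loops or duplicates) and the function names are our own assumptions, and the exponential-time search is not meant as an efficient algorithm. It uses the fact, also exploited at the end of the proof, that h vertices form a clique exactly when they induce $\binom{h}{2}$ edges.

```python
from itertools import combinations
from math import comb

def is_regular(vertices, edges):
    """Return True if all vertices have the same degree."""
    degree = {v: 0 for v in vertices}
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return len(set(degree.values())) <= 1

def induced_edge_count(edges, subset):
    """Number of edges with both endpoints in the given vertex subset."""
    chosen = set(subset)
    return sum(1 for u, v in edges if u in chosen and v in chosen)

def has_clique(vertices, edges, h):
    """Exhaustively test whether some h vertices are pairwise adjacent."""
    return any(induced_edge_count(edges, s) == comb(h, 2)
               for s in combinations(vertices, h))

# Hypothetical example: a 4-cycle is 2-regular and contains no triangle.
cycle_vertices = [1, 2, 3, 4]
cycle_edges = [(1, 2), (2, 3), (3, 4), (4, 1)]
cycle_regular = is_regular(cycle_vertices, cycle_edges)
cycle_has_triangle = has_clique(cycle_vertices, cycle_edges, 3)
```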
Let G be the input regular graph, let h be the size of the clique sought, and let δ be the common degree of G's vertices. If h > δ + 1, then, of course, the graph does not contain a size-h clique and we output a fixed "no"-instance of our problem. Otherwise, we output an instance according to the following construction (intuitively, since each $g_k$ is convex, the rule promotes situations where voters rank many members of the committee among their top k candidates; we exploit this fact).
We set the committee size k to be (c + 2)h (recall that c is defined in the theorem statement). Since $g_k$ does not depend on the number of candidates in the election, this fixes the counting function that we work with and we will denote it g. If $k \le k_0$ (recall that $k_0$ is defined in the statement of the theorem), then we solve the input instance using brute force in polynomial time and output either a fixed "yes"-instance or a fixed "no"-instance, depending on the result.
We note that for each i, 1 ≤ i ≤ sing(g), all the values g(i) − g(i − 1) are equal and, without loss of generality, we can assume them to either all be 0s or all be 1s (if this were not the case, we could scale g appropriately). Similarly, since g is convex, we can assume that g(sing(g)) − g(sing(g) − 1) > 1. We note that k − sing(g) ≥ k/c = (c + 2)h/c > h and, so, sing(g) < k − h.
We form an election with the following candidates:

Further, we create 2g(k) · (m + h) · g(k) filler voters, who rank the following candidates in the top k positions:
1. All the edge-filler candidates.
2. All the general-filler candidates.
3. Sufficiently many dummy candidates (different dummy candidates for each filler voter).
(The role of the 2g(k) multiplicity factor regarding both the edge voters and the filler voters is to ensure that the best committee does not contain any of the dummy candidates; this will become clear later in the proof.) We ask whether there is a committee W whose score is at least $T = T_1 + T_2 + T_3 + T_4$, where:

Note that each $T_i$, 1 ≤ i ≤ 4, is nonnegative (for $T_4$ this is due to convexity of g). The meaning of these values will become clear throughout the proof. This finishes the construction. Due to the assumptions regarding the counting function, the reduction is polynomial-time computable.
Let us now argue that the reduction is correct. First, we claim that if a committee W has a score of at least T , then it must contain all the edge-filler candidates and all the general-filler candidates. We note that altogether we have k − h edge-filler and general-filler candidates. Consider some committee W that contains k − h − x candidates of these two types, where x ≥ 1. This means that W contains at most h + x dummy candidates.
Let y be the number of filler voters that rank at least k − h members of W among their top k positions. Let us call these filler voters well-satisfied. For each of the well-satisfied filler voters, the members of W ranked in the top k positions are (a) the k − h − x edge-filler and general-filler candidates from W, and (b) at least x unique dummy candidates. Thus it must hold that $xy \le h + x$ and, so, $y \le \frac{h}{x} + 1$. If x ≥ 2, then it must be that y ≤ h. If x = 1, then this inequality gives us that y ≤ h + 1. However, for y to be h + 1, W would have to consist of k − h − 1 edge-filler and general-filler candidates and h + 1 dummy candidates. Each of these dummy candidates would have to be ranked among the top k positions by exactly one of the y well-satisfied filler voters. This would mean that for each edge voter, the only members of W ranked by this voter among the top k positions would be (some of) the edge-filler candidates. Consequently, all the edge voters would rank at most k − h − 1 members of W among their top k positions. In either case (that is, irrespective of whether x = 1 or x ≥ 2), we can upper-bound the score of committee W by assuming that there are 2g(k) · (m + h) · g(k) − h voters that assign score g(k − h − 1) to W and 2g(k) · m + h voters that assign score g(k) to it. In effect, we have the following inequalities (also see the explanations below):

The second inequality holds because g(k − h) > g(k − h − 1) + 1 (which holds due to the fact that g is convex, g(sing(g)) − g(sing(g) − 1) > 1, and sing(g) < k − h). Further inequalities hold due to simple calculations.

Due to the above reasoning, we can assume that every committee with score at least T contains all the k − h filler candidates. Consider some committee that contains all the k − h filler candidates. We claim that if this committee contains some dummy candidates then there is another committee with a higher score. Why is this so? Assume that the committee contains some z dummy candidates (z ≤ h). If we simply removed these dummy candidates (obtaining a smaller committee) then we would lose at most z · g(k) points. Then, we could bring the committee back to its intended size by performing the following operations sufficiently many times: either adding to the committee a single vertex candidate (already connected by an edge to one from the committee) or adding to the committee two vertex candidates connected by an edge. Each of these actions increases the score of the committee by at least $2g(k) \cdot \big(g(\mathrm{sing}(g)) - g(\mathrm{sing}(g) - 1)\big) > 2g(k)$ (because for each edge there are 2g(k) corresponding edge voters). Thus, we would obtain a committee with a score higher than the one we have started with. (Note that, technically, there might be no sequence of operations that brings our committee back to size k, but this would only happen if the graph had too few edges to contain a clique of size h and we could recognize that this is the case in polynomial time.)

Let W be some winning committee that contains all the k − h filler candidates, and some h vertex candidates (by the above paragraph, this committee cannot contain any dummy candidates), and let r be the number of edges that connect the vertices corresponding to the vertex candidates from W. Let us now calculate the score of W. The filler voters provide score $T_1$. The situation regarding the edge voters requires more care.
Each edge voter gets score at least g(sing(g) − 2) due to the edge-filler candidates. For each edge for which at least one endpoint is in W , we get additional g(sing(g) − 1) − g(sing(g) − 2) points, and for each edge whose both endpoints are in W , we get yet additional g(sing(g)) − g(sing(g) − 1) points. Thus, the edge voters give W the following score (see detailed explanations below): The first main term corresponds to the points all the edge voters receive, the second is the correction for edge voters that correspond to edges that have at least one endpoint in W (note that if for some edge both its endpoints belong to W , then we add g(sing(g) − 1) − g(sing(g) − 2) twice, once for each endpoint), and the final term corresponds to the correction for edges that have two endpoints in W . Let us now explain why this final correction is appropriate. Consider some edge voter for an edge whose both endpoints are in W . For this voter, we account g(sing(g) − 2) points that each edge voter gets, we account g(sing(g)−1)−g(sing(g)−2) points for each of the endpoints, and g(sing(g)) − g(sing(g) − 1) − 2(g(sing(g) − 1) − g(sing(g) − 2)) points of the final correction. Altogether, this sums up to: g(sing(g) − 2) + 2(g(sing(g) − 1) − g(sing(g) − 2)) + g(sing(g)) − g(sing(g) − 1) − 2(g(sing(g) − 1) − g(sing(g) − 2)) = g(sing(g)).
This means that, indeed, we compute the score of edge voters for edges whose both endpoints are in W correctly. The same holds for all the other edge voters (and follows directly from the above analysis). Finally, we note that the score of W that we obtain from the edge voters is maximized when r is maximized. The maximum value that r may have is $\binom{h}{2}$, which happens if and only if the vertex candidates in W correspond to a clique. Then the score that the edge voters provide equals $T_2 + T_3 + T_4$ and the total score of the committee is T.
We conclude that there exists a committee with score at least T if and only if the input graph contains a size-h clique.

B Proof of Theorem 13
Let θ be a positive real, 0 < θ < 1. For the sake of contradiction, let us assume that there is a polynomial-time algorithm A that, given an election E and committee size k, outputs a committee W such that, under NearlyBloc, the score of W is at least a θ fraction of the score of the winning committee. Using A, we will derive a θ/2-approximation algorithm for the DamkS problem.
Let I be an instance of the DamkS problem with a graph G and an integer K.
Our algorithm proceeds as follows. For each B, 1 ≤ B ≤ K, we form an election $E_B = (C_B, V_B)$ where:
1. The set of candidates is $C_B = V(G) \cup \bigcup_{e \in E(G)} D_e$, where for each e ∈ E(G), $D_e = \{d_{e,1}, \ldots, d_{e,B-2}\}$ is the set of dummy candidates needed for our construction.
2. The collection $V_B$ of voters is such that for each edge $e = \{u_1, u_2\} \in E(G)$ we have exactly one voter with preference order of the form $\{u_1, u_2\} \succ D_e \succ \cdots$.
For each election $E_B$, we run algorithm A to find a committee $W_B$ of size B. Each such committee $W_B$ generates an induced graph $G_B$ with the vertex set $V(G) \cap W_B$. We let $G_0$ be the trivial subgraph of G consisting of two vertices and their connecting edge (if G had no edges, then we could output a trivial optimal solution at this point).
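The construction of the elections and the density measure used in the analysis can be sketched as follows. This is a hedged illustration, not the algorithm A itself (which is a hypothetical black box in the proof): the function names, the dummy-candidate labels, and the choice to represent each voter only by her top B positions are our assumptions, and the sketch requires B ≥ 2 so that the number B − 2 of dummy candidates per edge is nonnegative.

```python
def build_election(vertices, edges, B):
    """Return (candidates, top_lists) for the election built from the graph.

    Each edge {u1, u2} yields one voter whose top B positions hold u1, u2
    and B - 2 dummy candidates private to that edge.
    """
    if B < 2:
        raise ValueError("this sketch assumes B >= 2")
    candidates = list(vertices)
    top_lists = []
    for idx, (u1, u2) in enumerate(edges):
        # Hypothetical labels for the B - 2 dummy candidates of edge number idx.
        dummies = [f"d_{idx}_{j}" for j in range(1, B - 1)]
        candidates.extend(dummies)
        top_lists.append([u1, u2] + dummies)
    return candidates, top_lists

def induced_density(edges, subset):
    """Density (edges divided by vertices) of the subgraph induced by subset."""
    chosen = set(subset)
    if not chosen:
        return 0.0
    count = sum(1 for u, v in edges if u in chosen and v in chosen)
    return count / len(chosen)

# Hypothetical example: a triangle with B = 4.
triangle = [(1, 2), (2, 3), (1, 3)]
candidates, top_lists = build_election([1, 2, 3], triangle, 4)
triangle_density = induced_density(triangle, [1, 2, 3])
single_edge_density = induced_density(triangle, [1, 2])
```

The value 1/2 obtained for a single edge is the density of the trivial subgraph that serves as the fallback solution in the analysis.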
Let us now argue that the above algorithm is a θ/2-approximation algorithm for the DamkS problem. Let OPT be an optimal solution for I, with the densest subgraph G' consisting of B vertices and X edges. By definition, G' has density δ = X/B. For each B let us consider two cases:

Case 1: X ≤ B/θ. In this case, the density of the optimal graph is at most 1/θ. However, a trivial solution with two vertices connected by an edge has density equal to 1/2. Thus, in this case this trivial solution is θ/2-approximate.

Case 2: X > B/θ. In this case we know that there exists a size-B committee for election $E_B$ with score at least X. Indeed, the committee that consists of the vertices from G' obtains one point for each edge from G' and has score X. Thus A for $E_B$ and committee size B outputs a committee W with score at least θX. Let U = W ∩ V(G) (that is, let U be the part of this committee that consists of the vertex candidates) and let D = W − U (that is, let D be the set of dummy candidates from W). We observe that the graph induced by U has at least θX − |D| edges. To see this, note that since each dummy candidate is ranked among the top B positions by exactly one voter, removing a dummy candidate from the committee (in effect decreasing the committee size) decreases the total score by at most one. Thus the committee consisting only of candidates from U has score at least θX − |D| and each of the points obtained by this committee comes from an edge between some members of U. Since U consists of B − |D| vertices, the graph induced by U has density δ' such that:

$$\delta' \ge \frac{\theta X - |D|}{B - |D|} \ge \frac{\theta X}{B} = \theta\delta,$$

where the last inequality follows from the assumption that B < θX. Indeed, note that: B(θX − |D|) = θXB − B|D| ≥ θXB − θX|D| = θX · (B − |D|).
By our assumptions, one of these conditions must hold. This means that the graph induced by U is a θ-approximate solution for I.
Since in both cases we obtain at least θ/2-approximate solutions, our algorithm is θ/2-approximate. Since it is clear that it runs in polynomial time, the proof is complete.

C Proof of Theorem 16
Our algorithm is based on solving a mixed integer linear program (MILP) in FPT time with respect to the number of integral variables. The key trick is to use non-integral variables in such a way that in every optimal solution they have to take integral values [this technique was first used by Bredereck et al. (2015)].
Let k be the input committee size and E = (C, V) be the input election, where $C = \{c_1, \ldots, c_m\}$ is the set of candidates and $V = (v_1, \ldots, v_n)$ is the collection of voters.
We enumerate all the nonempty subsets of V as $S_1, \ldots, S_{2^n-1}$. For each $i \in [2^n-1]$, let $T(S_i)$ denote the largest set of candidates that satisfies the following condition: Every voter in $S_i$ ranks each candidate from $T(S_i)$ among the top k positions and no other voter ranks any of the candidates from $T(S_i)$ among the top k positions. Note that $T(S_1), \ldots, T(S_{2^n})$ is a partition of C. We illustrate this partition in the following example.

Let C = {a, b, c, d, e, f} and $V = (v_1, \ldots, v_6)$, where the voters have the following preference orders (we set the committee size k = 3 and, thus, we list only the top k positions for each vote):
We have the following sets: T ({v 3

Our algorithm forms a mixed integer linear program with the following variables. We have $2^n - 1$ integer variables, $z_1, \ldots, z_{2^n-1}$, where, intuitively, each $z_i$ describes how many candidates from the set $T(S_i)$ we take into the winning committee. For each i ∈ [n] we also have an integer variable $x_i$, which describes how many candidates from the top k positions of the preference order of voter $v_i$ belong to the winning committee. Finally, for each variable $x_i$, we have rational variables $x_{i,j}$, $0 \le x_{i,j} \le 1$, such that (intuitively) each $x_{i,j}$ is 1 if $x_i$ is at least j. We present our mixed integer linear program in Fig. 2. To solve this program, we invoke Lenstra's famous result in its variant for mixed integer programming (Lenstra 1983, Section 5).
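The grouping of candidates by the exact set of voters who rank them among their top k positions can be computed directly, as in the sketch below. The election used here is a hypothetical one of our own (it is not the example from the text), voters are identified by their zero-based index, and the function name `top_k_partition` is an assumption. Candidates that no voter ranks among the top k positions end up under the empty voter set.

```python
from collections import defaultdict

def top_k_partition(candidates, votes, k):
    """Group candidates by the exact set of voters ranking them in the top k."""
    groups = defaultdict(list)
    for c in candidates:
        supporters = frozenset(
            i for i, vote in enumerate(votes) if c in vote[:k]
        )
        groups[supporters].append(c)
    return dict(groups)

# Hypothetical election with three voters and committee size k = 2.
votes = [
    ["a", "b", "c", "d"],
    ["a", "b", "d", "c"],
    ["c", "a", "b", "d"],
]
partition = top_k_partition(["a", "b", "c", "d"], votes, 2)
```

Since there are at most $2^n$ distinct voter sets, the number of groups is bounded by a function of the number n of voters alone, which is what allows the integer variables of the program to be indexed by these groups.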
Now it remains to argue that it indeed outputs a correct solution, that is, that the variables $z_1, \ldots, z_{2^n-1}$ describe a winning committee. If all the variables have the intended, intuitive values (as described in the preceding paragraph), then-with our

[Fig. 2: the mixed integer linear program. Objective: maximize $\sum_{i=1}^{n} \sum_{j=1}^{k} x_{i,j} \cdot (g_{m,k}(j) - g_{m,k}(j-1))$, subject to constraints (a) to (e).]

Due to constraints (a) and (e), variables $z_1, \ldots, z_{2^n-1}$ certainly describe a possible committee of size k (from each set $T(S_i)$ we take $z_i$ arbitrary candidates). Constraints (b) ensure the correct values of variables $x_1, \ldots, x_n$. Finally, the maximization goal and constraints (c) ensure that each variable $x_{i,j}$ is 1 exactly if $x_i \ge j$ and is 0 otherwise. This is so because $g_{m,k}$ is concave. Thus, if for some values j and j' with j < j' it were the case that $x_{i,j} < 1$ and $x_{i,j'} > 0$, then increasing $x_{i,j}$ and decreasing $x_{i,j'}$ by the same amount [without breaking constraint (d)] would yield a higher value of the function to be maximized.
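The role of concavity in this last step can be illustrated numerically. The sketch assumes that the program ties the fractional variables of a single voter together by requiring them to sum to $x_i$; this is our reading of the constraints rather than a quotation of them. Under that assumption, the best attainable objective contribution of a voter is the sum of the $x_i$ largest marginal gains, and it coincides with $g(x_i) - g(0)$ exactly when the gains are non-increasing. Counting functions are given as hypothetical value lists `[g(0), g(1), ...]`, and all names are ours.

```python
def marginal_gains(g):
    """Marginal gains g(j) - g(j - 1) for a counting function given as a list."""
    return [g[j] - g[j - 1] for j in range(1, len(g))]

def best_fractional_value(gains, x):
    """Maximum of sum_j y_j * gains[j] subject to 0 <= y_j <= 1, sum_j y_j = x.

    For an integer x the optimum puts weight 1 on the x largest gains.
    """
    return sum(sorted(gains, reverse=True)[:x])

def objective_matches_score(g):
    """Check whether the relaxed optimum equals g(x) - g(0) for every x."""
    gains = marginal_gains(g)
    return all(best_fractional_value(gains, x) == g[x] - g[0]
               for x in range(len(g)))

concave_g = [0, 3, 5, 6]   # gains 3, 2, 1 (non-increasing)
convex_g = [0, 1, 4, 9]    # gains 1, 3, 5 (increasing)
concave_ok = objective_matches_score(concave_g)
convex_ok = objective_matches_score(convex_g)
```

For the convex function the relaxed optimum with x = 1 picks the gain 5 instead of the intended gain 1, so the objective would overestimate the committee score; this is why the argument is restricted to concave counting functions.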