Approval-Based Shortlisting

Shortlisting is the task of reducing a long list of alternatives to a (smaller) set of best or most suitable alternatives. Shortlisting is often used in the nomination process of awards or in recommender systems to display featured objects. In this paper, we analyze shortlisting methods that are based on approval data, a common type of preferences. Furthermore, we assume that the size of the shortlist, i.e., the number of best or most suitable alternatives, is not fixed but determined by the shortlisting method. We axiomatically analyze established and new shortlisting methods and complement this analysis with an experimental evaluation based on synthetic and real-world data. Our results lead to recommendations on which shortlisting methods to use, depending on the desired properties.


Introduction
Shortlisting is a task that arises in many different scenarios and applications: given a large set of alternatives, identify a smaller subset that consists of the best or most suitable alternatives. Prototypical examples of shortlisting are awards where a winner must be selected among a vast number of eligible candidates. In these cases, we often find a two-stage process. In a first shortlisting step, the large number of contestants (books, films, individuals, etc.) is reduced to a smaller number. In a second step, the remaining contestants can be evaluated more closely and one contestant in the smaller set is chosen to receive the award.
Both steps may involve a form of group decision making (voting), but can also consist of a one-person or even automatic decision. For example, the shortlist of the Booker Prize is selected by a small jury (The Man Booker Prize, 2018), whereas the shortlists of the Hugo Awards are compiled based on thousands of ballots (The Hugo Awards, 2019). Similarly, the Baseball Writers' Association of America selects the new entries into the Baseball Hall of Fame by voting. In that case, any candidate with at least 75% approval enters the hall of fame, without a second round. Another very common application of shortlisting is the selection of the most promising applicants for a position who will be invited for an interview (Bovens, 2016; Singh et al., 2010). Apart from these prototypical examples, shortlisting is also useful in many less obvious applications like the aggregation of expert opinions, for example in the medical domain (Gangl et al., 2019) or in risk management and assessment (Tweeddale et al., 1992). Shortlisting can even be used in scenarios without agents in the traditional sense, for example if we consider features as voters to perform an initial screening of objects, i.e., a feature approves all objects that exhibit this feature (Faliszewski et al., 2020).
In this paper, we consider shortlisting as a form of collective decision making. We assume that a group of voters announce their preferences by specifying which alternatives they individually view as worthy of being shortlisted, i.e., they file approval ballots. In practice, approval ballots are commonly used for shortlisting, because the high number of alternatives that necessitates shortlisting in the first place precludes the use of ranked ballots. Furthermore, we assume that the number of alternatives to be shortlisted is not fixed (but there might be a preferred number), as there are very few shortlisting scenarios where there is a strong justification for an exact size of the shortlist. Due to this assumption, we are not in the classical setting of multi-winner voting (Kilgour and Marshall, 2012; Faliszewski et al., 2017; Lackner and Skowron, 2020), where a fixed-size committee is selected, but in the more general setting of multi-winner voting with a variable number of winners (Kilgour, 2010, 2016; Faliszewski et al., 2020).
In real-world shortlisting tasks, there are two prevalent methods in use: Multi-winner Approval Voting (selecting the k alternatives with the highest approval score) and threshold rules (selecting all alternatives approved by more than a fixed percentage of voters). Further shortlisting methods have been proposed in the literature (Brams and Kilgour, 2012; Kilgour, 2016; Faliszewski et al., 2020). Despite the prevalence of shortlisting applications, there does not exist work on systematically choosing a suitable shortlisting method. Such a recommendation would have to consider both expected (average-case) behavior and guaranteed axiomatic properties, and neither has been studied previously specifically for shortlisting applications (cf. related work below). Our goal is to answer this need and provide principled recommendations for shortlisting rules, depending on the properties that are desirable in the specific shortlisting process. In more detail, the contributions of this paper are the following:
• We define shortlisting as a voting scenario and specify minimal requirements for shortlisting methods (Section 2). Furthermore, we introduce five new shortlisting methods: First k-Gap, Largest Gap, Top-s-First-k-Gap, Max-Score-f-Threshold, and Size Priority (Section 3).
• We conduct an axiomatic analysis of shortlisting methods and thereby identify essential differences between them. Furthermore, we axiomatically characterize Approval Voting, f-Threshold, and the new First k-Gap rule (Section 4).
• We present a connection between shortlisting and clustering algorithms, as used in machine learning. We show that First k-Gap and Largest Gap can be viewed as instantiations of linkage-based clustering algorithms (Section 5).
• In numerical simulations using synthetic data, we approach two essential difficulties of shortlisting processes: we analyze the effect of voters with imperfect (noisy) perception of the alternatives and the effect of biased voters. These simulations complement our axiomatic analysis by highlighting further properties of shortlisting methods and provide additional data points for recommending shortlisting methods (Section 6).
• In addition to synthetic data, we collected voting data from the Hugo Awards, which are annual awards for works in science fiction. This data set is a real-world application of shortlisting and offers a challenging test-bed for shortlisting rules. Using this data set, we investigate the ability of different shortlisting rules to produce short shortlists without excluding the alternative that actually won the award (Section 6).
• An open-source implementation (Lackner and Maly, 2022) of all considered shortlisting rules and the numerical experiments is available, including the Hugo data set.
• The recommendations based on our findings are summarized in Section 7. In brief, our analysis leads to a recommendation of Top-s-First-k-Gap, f-Threshold, and Size Priority, depending on the general shortlisting goal and desired behavior.
A preliminary version of this work has appeared in the proceedings of the 20th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2021) (Lackner and Maly, 2021).

Related Work
There are two recent papers that are particularly relevant for our work. Both Faliszewski et al. (2020) and Freeman et al. (2020) investigate multi-winner voting with a variable number of winners. In contrast to our paper, the main focus of Faliszewski et al. (2020) lies on computational complexity, which is less of a concern for our shortlisting setting (as discussed later). The paper also contains numerical simulations related to the number of winners (which is one of the metrics we consider in our paper). In the few cases where shortlisting rules are considered, their results regarding the average size of winner sets agree with our simulations (Section 6). Freeman et al. (2020) study proportionality in shortlisting scenarios. A proportional representation of voters is incompatible with our desiderata of shortlisting rules (i.e., proportionality is incompatible with the Efficiency axiom, which we require for shortlisting rules). Thus, the rules and properties considered by Freeman et al. (2020) do not intersect with ours and are difficult to compare with them. A simplified separation between our work and theirs is the underlying assumption of fairness: we require that the most deserving candidates are included in the shortlist (fairness towards candidates), whereas proportionality is concerned with fairness towards voters.
There are two other notable voting frameworks with a variable number of winners.First, shortlisting rules can be viewed as a particular type of social dichotomy functions (Duddy et al., 2014;Brandl and Peters, 2019), i.e., voting rules which partition alternatives into two groups.Moreover, multiwinner voting with a variable number of winners can be seen as a special case of (binary) Judgment Aggregation (List, 2012;Endriss, 2016) without consistency constraints.However, both of these frameworks treat the set of selected winners and its complement as symmetric.This is in contrast to shortlisting, where we usually expect the winner set to be only a small minority of all available candidates.For this reason, social dichotomy functions and Judgment Aggregation rules are generally not well suited for shortlisting.
It is worth mentioning that shortlisting is studied not only as a form of collective decision making but also as a model of individual decision making. Manzini and Mariotti (2007) proposed Rational Shortlisting Methods as a model of human choice, which led to a number of works on shortlisting as a decision procedure, for example (Dutta and Horan, 2015), (Horan, 2016), (Kops, 2018), and (Tyson, 2013).
More generally, there is a substantial literature on multi-winner voting with a fixed number of winners (i.e., committee size), as witnessed by recent surveys (Kilgour and Marshall, 2012;Faliszewski et al., 2017;Lackner and Skowron, 2020).Multi-winner voting rules are much better understood, both from an axiomatic (Elkind et al., 2017b;Fernández et al., 2017;Aziz et al., 2017a;Lackner and Skowron, 2021;Sánchez-Fernández and Fisteus, 2019) and experimental (Elkind et al., 2017a;Bredereck et al., 2019) point of view, also in the context of shortlisting (Aziz et al., 2017b;Bredereck et al., 2017).Results for multi-winner rules, however, typically do not easily translate to the setting with a variable number of winners.

The Formal Model
In this section we describe our formal model that embeds shortlisting in a voting framework. The model consists of two parts: a general framework for approval-based elections with a variable number of winners (Kilgour, 2010, 2016; Faliszewski et al., 2020) on the one hand and, on the other hand, four basic axioms that we consider essential prerequisites for shortlisting rules.
An approval-based election E = (C, V) consists of a non-empty set of candidates (or alternatives) C = {c_1, ..., c_m} and an n-tuple of approval ballots V = (v_1, ..., v_n) where v_i ⊆ C. If c_j ∈ v_i, we say that voter i approves candidate c_j; if c_j ∉ v_i, voter i does not approve candidate c_j. We interpret a voter's approval of a candidate as the preference for this candidate being included in the shortlist. In the following we will always write n_E for the number of voters and m_E for the number of candidates in an election E. We will omit the subscript if E is clear from the context.
The approval score sc_E(c_j) of candidate c_j in election E is the number of approvals of c_j in V, i.e., sc_E(c_j) = |{i : 1 ≤ i ≤ n and c_j ∈ v_i}|. We write sc(E) for the vector (sc_E(c_1), ..., sc_E(c_m)). To avoid unnecessary case distinctions, we only consider non-degenerate elections: these are elections where not all candidates have the same approval score.
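As a minimal illustration of these definitions, the following Python sketch computes approval scores from a list of ballots and checks non-degeneracy. The function names and the toy election are hypothetical and are not taken from the authors' implementation.

```python
from collections import Counter


def approval_scores(candidates, ballots):
    """Return the approval score of every candidate, in the given order."""
    # Each ballot is a set of approved candidates; a candidate counts at
    # most once per ballot.
    counts = Counter(c for ballot in ballots for c in set(ballot))
    return [counts[c] for c in candidates]


def is_nondegenerate(scores):
    """An election is non-degenerate if not all scores are equal."""
    return len(set(scores)) > 1


# Hypothetical toy election with three candidates and four voters.
candidates = ["a", "b", "c"]
ballots = [{"a", "b"}, {"a"}, set(), {"a", "c"}]
scores = approval_scores(candidates, ballots)
print(scores, is_nondegenerate(scores))
```

Representing ballots as sets mirrors the definition v_i ⊆ C; an empty ballot is allowed and simply contributes no approvals.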
An approval-based variable multi-winner rule (which we refer to just as "voting rule") is a function mapping an election E = (C, V ) to a subset of C. Given a rule R and an election E, R(E) ⊆ C is the winner set according to voting rule R, i.e., R(E) is the set of candidates which have been shortlisted.Note that R(E) may be empty or contain all candidates.We refer to candidates in R(E) as winners or winning candidates.
Now we introduce the basic axioms that we require every shortlisting rule to satisfy.Anonymity and Neutrality are two basic fairness axioms for voting rules (Zwicker and Moulin, 2016).
Axiom 1 (Anonymity). All voters are treated equally, i.e., for every permutation π : {1, ..., n} → {1, ..., n} and election E = (C, V) where V = (v_1, ..., v_n), if E* = (C, (v_π(1), ..., v_π(n))), then R(E) = R(E*).
Axiom 2 (Neutrality). All candidates are treated equally, i.e., for every election E = (C, V) where V = (v_1, ..., v_n) and permutation π : C → C, if E* = (C, (π(v_1), ..., π(v_n))), then π(R(E)) = R(E*).
Shortlisting differs from other multi-winner scenarios in that we are not interested in representative or proportional committees. Instead, the goal is to select the most excellent candidates. This goal is formalized in the following axiom.
Axiom 3 (Efficiency). No winner can have a strictly smaller approval score than a non-winner, i.e., for all elections E = (C, V) and all candidates c_i, c_j ∈ C, if sc_E(c_i) > sc_E(c_j) and c_j ∈ R(E), then also c_i ∈ R(E).
The assumption that approval scores are approximate measures of the general quality of candidates can also be argued in a probabilistic framework: under reasonable assumptions a set of candidates with the highest approval scores coincides with the maximum likelihood estimate of the truly best candidates (Procaccia and Shah, 2015). Thus, we impose Efficiency to guarantee the inclusion of the most-likely best candidates.
Efficiency can also be argued for from the perspective of voters: Let R satisfy Efficiency and W = R(E) for some election E. Then we claim that there does not exist a set W' with |W'| = |W| such that every voter approves at least as many candidates in W' as in W and some voter approves strictly more. Indeed, such a set W' would have a strictly larger total approval score than W, which is impossible because, by Efficiency, W consists of |W| candidates with the highest approval scores. In this sense, efficient shortlists are Pareto efficient among shortlists of the same size.
It is also worth noting that Efficiency rules out proportional voting rules.It is easy to see why: a proportional selection of winner sets has to contain candidates supported by (sufficiently sized) minorities.As Efficiency demands that majority candidates are always to be preferred, any sensible notion of proportionality clashes with Efficiency.
The last of our basic axioms is Non-tiebreaking. Since the number of winners is variable in our setting, there is generally no need to break ties. Because tiebreaking is usually an arbitrary and unfair process, voting rules should not introduce unnecessary tiebreaking. This idea yields our fourth axiom:
Axiom 4 (Non-tiebreaking). If two candidates have the same approval score, either both or neither should be winners. That is, for all elections E = (C, V) and all candidates c_i and c_j, if sc_E(c_i) = sc_E(c_j), then c_i ∈ R(E) if and only if c_j ∈ R(E).
We postulate these four axioms as the minimal requirements for a voting rule to be considered a shortlisting rule in our sense.
Definition 1. An approval-based variable multi-winner rule is a shortlisting rule if it satisfies Anonymity, Neutrality, Efficiency, and Non-tiebreaking.
Observe that Non-tiebreaking and Efficiency are axioms that are only interesting if we consider voting with a variable number of winners.Clearly, no voting rule for voting with a fixed number of winners can be non-tiebreaking.Furthermore, except for the issue of how to break ties, there is exactly one voting rule for approval voting with a fixed number k of winners that satisfies Efficiency, namely picking the k candidates with maximum approval score (Multi-winner Approval Voting).
A consequence of Efficiency and Non-tiebreaking is that a shortlisting rule only has to decide how many winners there should be.This reduces the complexity of finding the winner set drastically as there are only linearly many possible winner sets, in contrast to the exponentially many subsets of C.
Observation 1.For every election there are at most m + 1 sets that can be winner sets under a shortlisting rule.
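Observation 1 can be made concrete with a short Python sketch. Under Efficiency and Non-tiebreaking, a winner set is determined by its size, and a size is admissible only if the cut does not separate two candidates with equal scores. The function name and the score vector below are illustrative assumptions, not part of the paper.

```python
def possible_winner_sizes(scores):
    """Sizes k such that the top-k candidates form an efficient,
    non-tiebreaking winner set."""
    s = sorted(scores, reverse=True)
    m = len(s)
    # k = 0 (empty set) and k = m (all candidates) never break a tie;
    # any other k is admissible only if there is a strict drop after
    # position k.
    return [k for k in range(m + 1) if k == 0 or k == m or s[k - 1] > s[k]]


# Hypothetical score vector with two tied pairs.
sizes = possible_winner_sizes([4, 4, 2, 1, 1, 0])
print(sizes)
```

Because at most one set exists per size, the number of candidate winner sets is bounded by m + 1, and ties reduce it further.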

Shortlisting Rules
In the following, we define the shortlisting rules that we study in this paper. We define these rules by specifying which properties a candidate has to satisfy to be contained in the winner set. As before, let E = (C, V) be an election. We assume additionally that c_1, ..., c_m is an enumeration of the candidates in descending order of approval score, i.e., such that sc_E(c_{i−1}) ≥ sc_E(c_i) for all 2 ≤ i ≤ m. We will illustrate all rules on the following example:
Example 1. Let E = (C, V) be an election with 10 voters and 8 candidates c_1, ..., c_8. The scores are given by sc(E) = (10, 10, 9, 8, 6, 3, 3, 0). This instance is illustrated in Figure 1. There are seven possible winner sets for a shortlisting rule: ∅, {c_1, c_2}, {c_1, c_2, c_3}, {c_1, ..., c_4}, {c_1, ..., c_5}, {c_1, ..., c_7}, and {c_1, ..., c_8}.

Established Rules
First we introduce the shortlisting rules that are either commonly used in practice or have been proposed in the literature before. A natural idea is to select all most-approved candidates. The corresponding winner set equals the set of co-winners of classical Approval Voting (Brams and Fishburn, 1978).
Rule 1 (Approval Voting). c ∈ R(E) if and only if sc_E(c) = max_{c'∈C} sc_E(c').
The winners under Approval Voting in Example 1 are c_1 and c_2 as they both have the highest score.
Another natural way to determine the winner set is to fix some percentage threshold and declare all alternatives that surpass this approval threshold to be winners (Kilgour, 2010). For example, for a baseball player to be entered into the Hall of Fame, more than 75% of the members of the Baseball Writers' Association of America have to approve this nomination (BWAA, 2019). Such rules are known as quota rules in judgment aggregation (Endriss, 2016).
Rule 2 (f-Threshold). Let f be a function that maps the number of voters n to a threshold f(n). Then c ∈ R(E) if and only if sc_E(c) > f(n).
Consider, for example, f(n) = n/2. Then an alternative is a winner if it is approved by more than 50% of all voters. In Example 1 this would mean that the winner set contains all candidates with 6 or more approvals, i.e., c_1, ..., c_5.
A sensible modification of f-Threshold would be to select all alternatives with an above-average approval score, i.e., the set of winners consists of all alternatives c with sc_E(c) > (1/m) · Σ_{c'∈C} sc_E(c'). This rule is also a shortlisting rule in our sense. However, as it will, in expectation, select half of the available candidates, we do not think that it is a reasonable rule in most shortlisting settings. Therefore, we do not study it and only mention that it might be a good rule in other voting contexts with a variable number of winners. For example, Duddy et al. (2016) analyzed this rule and concluded that it is the best rule for partitioning alternatives into homogeneous groups (see also the axiomatic characterization of this rule in (Brandl and Peters, 2019)).
Another natural modification is to base the threshold not on the number of voters but on the highest approval score achieved by a candidate. We call this Max-Score-f-Threshold. This variant of f-Threshold turns out to be well suited to shortlisting as it formalizes the goal of selecting all candidates that are close to the top.
Rule 3 (Max-Score-f-Threshold). Let f be a threshold function. Then c ∈ R(E) if and only if sc_E(c) > f(max_{c'∈C} sc_E(c')).
We observe that c_1 and c_2 in Example 1 have score n, hence f-Threshold and Max-Score-f-Threshold coincide on the example.
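The three rules discussed so far can be sketched in a few lines of Python. This is a hedged illustration: the function names are hypothetical, winners are returned as 0-based indices into the score vector, and the reading of Max-Score-f-Threshold as "score above f applied to the maximum score" is an assumption based on the description above.

```python
def approval_voting(scores):
    """All candidates with the maximum approval score."""
    top = max(scores)
    return [i for i, s in enumerate(scores) if s == top]


def f_threshold(scores, n, f):
    """All candidates approved by more than f(n) voters."""
    return [i for i, s in enumerate(scores) if s > f(n)]


def max_score_f_threshold(scores, f):
    """Assumed variant: the threshold is f applied to the highest score."""
    top = max(scores)
    return [i for i, s in enumerate(scores) if s > f(top)]


# Scores of Example 1 (10 voters, 8 candidates).
scores = [10, 10, 9, 8, 6, 3, 3, 0]
half = lambda x: x / 2  # "more than 50%"

av = approval_voting(scores)
ft = f_threshold(scores, 10, half)
ms = max_score_f_threshold(scores, half)
print(av, ft, ms)
```

On Example 1 the two threshold variants agree because the maximum score equals the number of voters; they differ as soon as no candidate is approved unanimously.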
The next three rules are further shortlisting methods that have been proposed in the literature.First Majority (Kilgour, 2016) includes as many alternatives as necessary to comprise more than half of all approvals.The following definition deviates slightly from the original definition (Kilgour, 2016) in that it is non-tiebreaking.
Rule 4 (First Majority). Let i be the smallest index such that Σ_{j≤i} sc_E(c_j) > Σ_{j>i} sc_E(c_j). Then c ∈ R(E) if and only if sc_E(c) ≥ sc_E(c_i).
The candidates in Example 1 together have 49 approvals. Therefore, a shortlist needs at least 25 approvals to be the First Majority winner set. The smallest shortlist to achieve at least 25 approvals is {c_1, c_2, c_3} with 29 approvals.
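The definition of First Majority translates directly into a running-sum computation. The sketch below is an illustration under the stated definition; the function name is hypothetical and winners are returned as 0-based indices.

```python
def first_majority(scores):
    """Winners under First Majority, as 0-based indices into scores."""
    ordered = sorted(scores, reverse=True)
    total = sum(ordered)
    running = 0
    for score in ordered:
        running += score
        # Stop at the first prefix holding more approvals than the rest.
        if running > total - running:
            # Include everyone tied with the last admitted candidate.
            return [j for j, x in enumerate(scores) if x >= score]
    # Only reached if every score is zero (a degenerate election).
    return list(range(len(scores)))


fm = first_majority([10, 10, 9, 8, 6, 3, 3, 0])
print(fm)
```

Comparing against the cutoff score rather than slicing a prefix is what makes this version non-tiebreaking.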
Next-k (Brams and Kilgour, 2012) is a rule that includes alternatives starting with the highest approval score, until a major drop in the approval scores is encountered, more precisely, if the total approval score of the next k alternatives is less than the score of the previous alternative.
Rule 5 (Next-k). Let k be a positive integer. Then, c_i ∈ R(E) if for all i' < i it holds that sc_E(c_{i'}) ≤ Σ_{j=1}^{k} sc_E(c_{i'+j}), where sc_E(c_{i'+j}) = 0 if i' + j > m.
Consider Next-2. Then it is easy to check that, in Example 1, for all i ≤ 6 the score of c_i is smaller than or equal to the sum of the scores of the next two candidates. For example, sc_E(c_1) = 10 ≤ 19 = sc_E(c_2) + sc_E(c_3). On the other hand, sc_E(c_7) = 3 > 0 = sc_E(c_8) + 0. Therefore, the winner set under Next-2 is {c_1, ..., c_7}.
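A short Python sketch of Next-k follows. It returns the size of the winner set, which suffices because the winners are always a prefix of the candidates sorted by score. The function name is hypothetical.

```python
def next_k_size(scores, k):
    """Number of winners under Next-k."""
    s = sorted(scores, reverse=True)
    size = 1  # the top candidate always wins: the condition is vacuous
    for i in range(len(s) - 1):
        # Sum of the next k scores; slicing past the end simply yields
        # fewer terms, which matches treating missing scores as 0.
        if s[i] <= sum(s[i + 1:i + 1 + k]):
            size += 1
        else:
            break
    return size


example = [10, 10, 9, 8, 6, 3, 3, 0]
print(next_k_size(example, 2), next_k_size(example, 1))
```

For k = 1 the loop stops at the first strict drop in scores, so the result coincides with Approval Voting on this example.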
Observe that for both Next-k and First Majority the winner set does not depend on the chosen enumeration of alternatives.This will hold for all voting rules introduced in this paper.Faliszewski et al. (2020) discuss several other rules that satisfy our basic axioms called Capped Satisfaction Approval Voting (CSA), Net Approval Voting (NAV) and Net Capped Satisfaction Approval Voting (NCSA) which were originally proposed by Brams and Kilgour (2012) and Brams and Kilgour (2015) as well as generalizations of these rules.Among these Faliszewski et al. (2020) conclude that only the following generalization of NCSA is practical.
Rule 6 (q-NCSA). Let q ∈ [0, 1] be a real number and S ⊆ C a set of candidates. Then we define the q-NCSA-score of S as: sc^{q-NCSA}_E(S) = Σ_{i=1}^{n} (|S ∩ v_i| − |S \ v_i|) / |S|^q. The winner set then is the largest set with a maximum q-NCSA-score.
It is not immediately obvious that q-NCSA is a shortlisting rule in our sense. The following proposition shows that this is indeed the case and establishes some further key properties of q-NCSA.
Proposition 1. The q-NCSA rule has the following properties for all q ∈ [0, 1].
1. It holds that sc^{q-NCSA}_E(S) = (1 / |S|^q) · Σ_{c∈S} (2 · sc_E(c) − n).
2. q-NCSA is a shortlisting method.
3. q-NCSA can be computed in polynomial time.
Proof. We prove the three statements separately:
1. Let 1[·] be the indicator function. Then we can write the q-NCSA-score also as follows: sc^{q-NCSA}_E(S) = (1 / |S|^q) · Σ_{i=1}^{n} Σ_{c∈S} (1[c ∈ v_i] − 1[c ∉ v_i]) = (1 / |S|^q) · Σ_{c∈S} (sc_E(c) − (n − sc_E(c))) = (1 / |S|^q) · Σ_{c∈S} (2 · sc_E(c) − n).
2. It is clear that q-NCSA satisfies Anonymity and Neutrality. Consider Efficiency: Assume there are two candidates c_i and c_j such that sc_E(c_i) < sc_E(c_j), c_i ∈ R(E) and c_j ∉ R(E). Then there must be an S ⊆ C with c_i ∈ S and c_j ∉ S which has maximal q-NCSA-score. However, by definition the q-NCSA-score of (S \ {c_i}) ∪ {c_j} is higher than the q-NCSA-score of S, a contradiction.
The non-tiebreaking property follows from the following claim: By Efficiency, the largest set with maximum q-NCSA-score is of the form {c 1 , . . ., c i } for some c i .As the set has maximum q-NCSA-score, it holds in particular that However, this is a contradiction to the assumption that {c 1 , . . ., c i } was the largest set with maximal q-NCSA-score.The proof of Claim 1 contains a lengthy calculation and can be found in the appendix.
3. As we have shown that q-NCSA is a shortlisting rule, we know that we only need to consider sets that are efficient and non-tiebreaking. Further, we can clearly compute the q-NCSA-score of a set in polynomial time. As there are at most linearly many potential winner sets (Observation 1), finding the one with maximum q-NCSA-score can be done in polynomial time.
Consider again Example 1. Then, the 0.5-NCSA-score of the shortlist {c_1, ..., c_4} is (2 · (10 + 10 + 9 + 8) − 4 · 10) / √4 = 34/2 = 17. It can be checked that this is the unique maximal 0.5-NCSA-score and hence {c_1, ..., c_4} is the winner set under 0.5-NCSA.
Observation 2. An important feature (and downside) of q-NCSA is that candidates with an approval score of less than n/2 can only decrease sc^{q-NCSA}_E(S). Consequently, q-NCSA returns the empty set in elections where all candidates have few approvals.
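The polynomial-time argument can be illustrated with a small Python sketch that scans all prefixes of the candidates sorted by score. It rests on two assumptions that should be checked against the original definition: the score of a non-empty set S is taken to be the sum of the net approvals 2·sc_E(c) − n over c ∈ S, divided by |S|^q (this matches the descriptions of the cases q = 0 and q = 1 in the text), and the empty set is given score 0. The function name is hypothetical.

```python
def q_ncsa(scores, n, q):
    """Return (size, score) of the largest prefix with maximum q-NCSA-score.

    Assumption: score(S) = sum(2*sc(c) - n for c in S) / len(S)**q,
    and the empty set has score 0.
    """
    s = sorted(scores, reverse=True)
    best_size, best_score = 0, 0.0
    net = 0
    for size in range(1, len(s) + 1):
        net += 2 * s[size - 1] - n  # net approval of the added candidate
        score = net / size ** q
        if score >= best_score:  # ">=" keeps the largest maximiser
            best_size, best_score = size, score
    return best_size, best_score


example = [10, 10, 9, 8, 6, 3, 3, 0]
size_half, score_half = q_ncsa(example, 10, 0.5)
print(size_half, score_half)
```

Scanning prefixes only is justified by Efficiency; the comparison uses floating-point numbers, so exact ties between prefixes of different sizes may require rational arithmetic in a careful implementation.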

New Shortlisting Rules
Let us now introduce some new shortlisting rules.Similarly to Next-k, the next two rules are based on the idea that one wants to make the cut between winners and non-winners in a place where there is a large gap in the approval scores.This can either be the overall largest gap or the first sufficiently large gap.
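Rule 7 (Largest Gap). Let i be the smallest index such that sc_E(c_i) − sc_E(c_{i+1}) = max_{j<m} (sc_E(c_j) − sc_E(c_{j+1})). Then c ∈ R(E) if and only if sc_E(c) ≥ sc_E(c_i).
Note that in this definition a smallest index is guaranteed to exist due to our assumption that profiles are non-degenerate. In Example 1 the two largest gaps are between c_5 and c_6 and between c_7 and c_8, both of size 3. As we pick the smaller index, the winner set is {c_1, ..., c_5}.
Rule 8 (First k-Gap). Let k be a positive integer and let i be the smallest index such that sc_E(c_i) − sc_E(c_{i+1}) ≥ k. Then c ∈ R(E) if and only if sc_E(c) ≥ sc_E(c_i). If no such index exists, every alternative is a winner.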
Let us consider First 2-Gap in Example 1.The gaps between c 1 and c 2 , c 2 and c 3 as well as between c 3 and c 4 are smaller than two, while the gap between c 4 and c 5 is 2. Therefore the winner set is {c 1 , c 2 , c 3 , c 4 }.
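Both gap-based rules reduce to inspecting the list of consecutive score differences. The following Python sketch, with hypothetical function names, returns the size of the winner set for each rule and assumes at least two candidates.

```python
def score_gaps(sorted_scores):
    """Differences between consecutive scores in descending order."""
    return [a - b for a, b in zip(sorted_scores, sorted_scores[1:])]


def largest_gap_size(scores):
    """Cut at the first occurrence of the largest gap."""
    gaps = score_gaps(sorted(scores, reverse=True))
    return gaps.index(max(gaps)) + 1  # index() finds the smallest position


def first_k_gap_size(scores, k):
    """Cut at the first gap of size at least k; otherwise select everyone."""
    s = sorted(scores, reverse=True)
    for i, gap in enumerate(score_gaps(s)):
        if gap >= k:
            return i + 1
    return len(s)


example = [10, 10, 9, 8, 6, 3, 3, 0]
print(largest_gap_size(example), first_k_gap_size(example, 2))
```

A cut is only ever placed at a strictly positive gap, so neither function can separate tied candidates.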
The parameter k has to capture what it means in a given shortlisting scenario that there is a sufficiently large gap between alternatives, which in particular depends on the number of voters |V|. If no further information is available, one can choose k by a simple probabilistic argument. Assume, for example, that alternative c's approval score is binomially distributed, sc_E(c) ∼ B(n, q_c), where n is the number of voters and q_c can be seen as c's quality. We choose k such that the probability of events of the following type is smaller than a selected threshold α: two alternatives a and b have the same objective quality (q_a = q_b) but have a difference in their approval scores of k or more. In such a case, the First k-Gap rule might choose one alternative and not the other even though they are equally qualified, which is an undesirable outcome. For example, if n = 100 and we want α = 0.5, we have to choose k ≥ 5 and if we want α = 0.1 we need k ≥ 12. Note that this argument leads to rather large k-values; if further assumptions about the distribution of voters can be made, smaller k-values are feasible.
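The probabilistic argument can be explored numerically. The sketch below computes, for two alternatives of equal quality q, the exact probability that their approval scores differ by at least k, and searches for the smallest k that pushes this probability below α. It assumes that the two scores are independent and that q is known; both are modelling assumptions of this illustration, so the resulting values need not coincide with the figures quoted above. All function names are hypothetical.

```python
from math import comb


def binomial_pmf(n, q):
    """Probability mass function of B(n, q) as a list indexed by outcome."""
    return [comb(n, x) * q ** x * (1 - q) ** (n - x) for x in range(n + 1)]


def prob_gap_at_least(n, q, k):
    """P(|X - Y| >= k) for independent X, Y ~ B(n, q)."""
    pmf = binomial_pmf(n, q)
    return sum(
        pmf[x] * pmf[y]
        for x in range(n + 1)
        for y in range(n + 1)
        if abs(x - y) >= k
    )


def smallest_k(n, q, alpha):
    """Smallest k with P(|X - Y| >= k) < alpha (requires alpha > 0)."""
    for k in range(n + 2):
        if prob_gap_at_least(n, q, k) < alpha:
            return k


print(smallest_k(100, 0.5, 0.1))
```

The double sum has (n + 1)^2 terms, which is unproblematic for a few hundred voters; for larger electorates a normal approximation of the score difference would be a natural substitute.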
The voting rules above output winner sets of very different sizes (as we will see in the experimental evaluation, Section 6).It is a common case, however, that there is a preferred size for the winner set, but this size can be varied in order to avoid tiebreaking.This flexibility is especially crucial if the electorate is small and ties are more frequent.Based on real-world shortlisting processes, we propose a rule that deals with this scenario by accepting a preference order over set sizes as parameter and selecting a winner set with the most preferred size that does not require tiebreaking.
Rule 9 (Size Priority). Let ≻ be a strict total order on {0, ..., m}, the priority order. Then R(E) = {c_1, ..., c_k}, where k is the most preferred size with respect to ≻ such that k = 0, k = m, or sc_E(c_k) > sc_E(c_{k+1}) (for k = 0 the winner set is empty).
Consider for example a strict total order of the form 1 ≻ 6 ≻ 0 ≻ .... Then the set of Size Priority winners in Example 1 is the empty set, because {c_1} and {c_1, ..., c_6} break ties, as sc_E(c_1) = sc_E(c_2) and sc_E(c_6) = sc_E(c_7).
Size Priority is a non-tiebreaking analogue of Multi-winner Approval Voting, which selects the k alternatives with the highest approval score. A specific instance of Size Priority was used by the Hugo Awards prior to 2017 with the priority order 5 ≻ 6 ≻ 7 ≻ ... (The Hugo Awards, 2019). Generally, the choice of a priority order depends on the situation at hand. For award shortlisting, typically a small number of alternatives is selected (the Booker Prize, e.g., has a shortlist of size 6). In a much more principled fashion, Amegashie (1999) argues that the optimal size of the winner set for shortlisting should be proportional to √m, i.e., the square root of the number of alternatives. In practice, the most common priority order is k ≻ k + 1 ≻ ... ≻ m for some k < m, i.e., the smallest non-tiebreaking shortlist that contains at least k alternatives is selected. Another important special case consists of instances of Size Priority that rank 0 and m the lowest, i.e., that are decisive whenever possible. Therefore, we give Size Priority rules based on such priority orders a special name.
Definition 2. Let ≻ be a strict total order on {0, ..., m} and let k be a positive integer with k ≤ m such that k ≻ k + 1 ≻ ... ≻ m and m ≻ ℓ for all ℓ < k. Then, the Size Priority rule defined by the priority order ≻ is an Increasing Size Priority rule. We will write ISP-k as a short form for the Increasing Size Priority rule with k ≻ k + 1 ≻ ... as priority order.
Let ≻ be a strict total order on {0, ..., m} such that k ≻ m and k ≻ 0 holds for all 0 < k < m. Then, the Size Priority rule defined by the priority order ≻ is a Decisive Size Priority rule.
Other special cases of Size Priority could be defined in a similar way, for example Decreasing Size Priority. However, Increasing Size Priority and Decisive Size Priority are the most natural and common types of Size Priority and additionally satisfy better axiomatic properties than, e.g., Decreasing Size Priority.
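Size Priority can be sketched as a scan over the priority order that stops at the first size not requiring tiebreaking. In this hedged Python illustration the priority order is passed as a sequence of sizes from most to least preferred; the function names are hypothetical, and only the leading part of an order needs to be supplied if it already contains an admissible size.

```python
def size_priority(scores, priority):
    """Size of the winner set for the given priority order over sizes."""
    s = sorted(scores, reverse=True)
    m = len(s)
    for size in priority:
        # Sizes 0 and m never break ties; any other size must be
        # followed by a strict drop in scores.
        if size == 0 or size == m or s[size - 1] > s[size]:
            return size
    raise ValueError("no admissible size found in the priority order")


def isp(scores, k):
    """Increasing Size Priority with order k, k+1, ..., m."""
    return size_priority(scores, range(k, len(scores) + 1))


example = [10, 10, 9, 8, 6, 3, 3, 0]
print(size_priority(example, [1, 6, 0]), isp(example, 5), isp(example, 1))
```

The call isp(example, 1) returns the number of top-scoring candidates, in line with the observation that ISP-1 coincides with Approval Voting.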
Finally, we propose a rule that combines the ideas behind First k-Gap and Size Priority.In practice, we often want to have a large gap between winners and non-winners, but not at any price in terms of the size of the shortlist.
Rule 10 (Top-s-First-k-Gap). Let W be the winner set for First k-Gap and W' the winner set for the Increasing Size Priority instance defined by the order s ≻ s + 1 ≻ ...
Let us now consider the relationships between the proposed rules.
Proposition 2. We observe the following relations between the considered voting rules:
• First k-Gap and Next-k are equivalent to Approval Voting for k = 1.
• ISP-1 is equivalent to Approval Voting.
• Top-s-First-k-Gap is equivalent to First k-Gap for s = m and it is equivalent to Increasing Size Priority for k = m.
Proof.First observe that First-1-Gap and Next-1 select all candidates which have maximal score.Now let c i be the first candidate which has less than the maximal score.
Then sc
We finally observe that q-NCSA for q = 1 is a mix of Approval Voting and f-Threshold and for q = 0 is closely related to f-Threshold for f(n) = n/2. First consider q = 1. If any candidate is approved by more than 50% of the voters, then 1-NCSA is equivalent to Approval Voting, as the 1-NCSA-score equals the average net approval of the candidates in the set. This score is maximized by any set only containing candidates with maximal approval. On the other hand, if no candidate has more than 50% approvals, then no set has positive q-NCSA-score. Therefore, the empty set is the smallest set with maximal q-NCSA-score.
Now consider q = 0. We observe that the q-NCSA-score of a set S ⊆ C is then the sum of the net approvals of the candidates, where the net approval of a candidate c is sc_E(c) − (n − sc_E(c)). Hence the 0-NCSA-score is maximized by every set that contains all candidates with positive net approval and an arbitrary number of candidates with 0 net approval. A candidate has non-negative net approval if and only if 2 · sc_E(c) − n ≥ 0, which is equivalent to sc_E(c) ≥ n/2.
To conclude the section, let us remark that all of the above rules can be computed in polynomial time. This follows immediately from their respective definitions. For q-NCSA, we made the argument explicit in Proposition 1.

Axiomatic Analysis
In this section, we axiomatically analyze shortlisting rules with the goal to discern their defining properties.First, we consider axioms that are motivated by the specific requirements of shortlisting, then we study well-known axioms that describe more generally desirable properties of voting rules.For an overview, see Table 1.

Table 1: Results of the axiomatic analysis.

ℓ-Stability, Unanimity, and Anti-Unanimity
When shortlisting is used for the initial screening of candidates, for example for an award or a job interview, then we cannot assume that the voters have perfect judgment.Otherwise, there would be no need for a second round of deliberation, as we could just choose the highest-scoring alternative as a winner.Therefore, small differences in approval may not correctly reflect which alternative is more deserving of a spot on the shortlist.Thus, out of fairness, we want our voting rule to treat alternatives differently only if there is a significant difference in approval between them.
Axiom 5 ( -Stability).If the approval scores of two alternatives differ by less than , either both or neither should be a winner, i.e., for every election E = (C, V ) and candidates c i and Here, the parameter has to capture what constitutes a significant difference in a given election.This will depend, for example, on the number and trustworthiness of the voters.Also, observe that 1-Stability equals Non-tiebreaking.Now, while a small difference in approvals might not correctly reflect the relative quality of the candidates, we generally assume in shortlisting that the approval scores approximate the underlying quality of alternatives 5 .Therefore, at a minimum, we want to include alternatives that are approved by everyone and exclude alternatives that are approved by no one.
Axiom 6 (Unanimity).If an alternative is approved by everyone, it must be a winner, i.e., for every election Axiom 7 (Anti-Unanimity).If an alternative is approved by no one, it cannot win, i.e., for every election Unfortunately, it turns out that these three axioms are incompatible unless there are many more voters than alternatives.Indeed Unanimity, Anti-Unanimity and -Stability can be jointly satisfied if and only if Theorem 3.For every there is a shortlisting rule that satisfies Unanimity, Anti-Unanimity and -Stability for every election E such that n E > ( − 1) • (m E − 1).This is a tight bound in the following sense: For every > 1, there is an election E such that n E = ( − 1) • (m E − 1) and no shortlisting rule can satisfy Unanimity, Anti-Unanimity and -Stability on E.
Proof. To show that Unanimity, Anti-Unanimity and ε-Stability are jointly satisfiable if $n_E > (\varepsilon - 1) \cdot (m_E - 1)$, we will show that a slightly modified version of First k-Gap satisfies all three axioms for elections $E$ with $n_E > (\varepsilon - 1) \cdot (m_E - 1)$. We define Modified First ε-Gap as follows: Let $c_1, \dots, c_m$ be an enumeration of $C$ such that $sc_E(c_j) \geq sc_E(c_{j+1})$ for all $j < m$. If $i$ is the smallest index such that $sc_E(c_i) - sc_E(c_{i+1}) \geq \varepsilon$, then $R(E) = \{c_1, \dots, c_i\}$. If no such index exists, then $R(E) = \emptyset$ if there is an alternative $c$ with $sc_E(c) = 0$, and $R(E) = C$ otherwise. Clearly, this rule still satisfies ε-Stability.
Now, let $E$ be an election such that there is an alternative $c$ with $sc_E(c) = n$. Assume first that there is no alternative $c'$ with $sc_E(c') = 0$. In that case, Modified First ε-Gap vacuously satisfies Anti-Unanimity and, by definition, also Unanimity. Now assume that there is an alternative $c'$ with $sc_E(c') = 0$. We claim that there is an index $i$ such that $sc_E(c_i) - sc_E(c_{i+1}) \geq \varepsilon$ and hence only alternatives $c$ such that $sc_E(c) \geq sc_E(c_i) > \varepsilon - 1$ are winners. Otherwise, we have $sc_E(c_{i+1}) \geq sc_E(c_i) - (\varepsilon - 1)$ for all $i < m$ and hence $sc_E(c_m) \geq sc_E(c_1) - (\varepsilon - 1) \cdot (m - 1)$. However, as $sc_E(c_1) = n > (\varepsilon - 1) \cdot (m - 1)$, this contradicts the assumption that there is an alternative with score 0, i.e., $sc_E(c_m) = 0$.
Finally, let $E$ be an election such that there is no alternative $c$ with $sc_E(c) = n$. Then, Modified First ε-Gap vacuously satisfies Unanimity. Now, if there is an alternative $c'$ with $sc_E(c') = 0$, then we have to distinguish two cases. If there is no ε-gap, then $R(E) = \emptyset$ by definition and hence Modified First ε-Gap satisfies Anti-Unanimity. On the other hand, if there is an ε-gap, then only alternatives above the ε-gap are selected, which must have a score of ε or larger. Hence, Anti-Unanimity is also satisfied.
Now we show the tightness of the theorem. Let $E$ be an election with 2 alternatives and $\varepsilon - 1$ voters such that $sc(E) = (\varepsilon - 1, 0)$. We observe $n_E = \varepsilon - 1 = (\varepsilon - 1) \cdot (2 - 1)$. We claim that no rule $R$ satisfies Unanimity, Anti-Unanimity and ε-Stability on $E$. Indeed, $c_1 \in R(E)$ must hold by Unanimity. Then $sc_E(c_1) - sc_E(c_2) < \varepsilon$ implies $c_2 \in R(E)$ by ε-Stability, contradicting Anti-Unanimity.
Theorem 3 tells us that ε-Stability requires some sacrifices as it is incompatible with the combination of Unanimity and Anti-Unanimity. However, First k-Gap can be seen as an optimal compromise as, with a small modification, it satisfies Anti-Unanimity whenever Theorem 3 allows it.
Let us now analyze the considered shortlisting rules with regard to the three axioms Unanimity, Anti-Unanimity and ε-Stability.
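The modified rule from the proof of Theorem 3 can be sketched in a few lines of Python. This is a minimal illustration under the assumption that the rule cuts at the first gap of size at least ε in the descending score order and otherwise falls back to the empty set or to all alternatives as described in the proof. The function name and the example profiles are hypothetical.

```python
def modified_first_gap(scores: list[int], eps: int) -> set[int]:
    """Winner indices under the modified First eps-Gap rule (sketch)."""
    # Enumerate alternatives by descending approval score.
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranked = [scores[i] for i in order]
    # Cut at the first gap of size at least eps.
    for i in range(len(ranked) - 1):
        if ranked[i] - ranked[i + 1] >= eps:
            return set(order[: i + 1])
    # No eps-gap: empty set if some alternative has score 0, else everyone.
    if 0 in scores:
        return set()
    return set(order)


# n = 3 > (3 - 1) * (2 - 1): all three axioms hold on this profile.
above_bound = modified_first_gap([3, 0], eps=3)
# n = 2 = (3 - 1) * (2 - 1): the tight case, Unanimity must fail here.
at_bound = modified_first_gap([2, 0], eps=3)
```

On the tight profile with scores (2, 0) the rule returns the empty set, so the unanimously approved alternative is excluded, which is the unavoidable failure described in the tightness argument.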
• It is straightforward to see that Approval Voting, f-Threshold, Max-Score-f-Threshold and Largest Gap satisfy Unanimity and Anti-Unanimity for all non-degenerate profiles. Hence, they cannot satisfy ε-Stability for $\varepsilon > 1$.
• By definition, First k-Gap satisfies Unanimity and ε-Stability for $k \geq \varepsilon$ for all elections. Therefore, it cannot satisfy Anti-Unanimity.
• First Majority satisfies Unanimity as c 1 ∈ R(E) by definition and sc . Furthermore, we claim that it satisfies Anti-Unanimity.Let c 1 , . . ., c m be the enumeration of the candidates used by First Majority and let c i be the first candidate with sc E (c i ) = 0.Then, It follows that First Majority does not satisfy -Stability for > 1.
• Next-k satisfies Unanimity by definition. Furthermore, we claim that it satisfies Anti-Unanimity. Let $c_1, \dots, c_m$ be the enumeration of the candidates used by Next-k and let $c_i$ be the first candidate with $sc_E(c_i) = 0$. Then, $sc_E(c_{i-1}) > 0$ while $\sum_{j=1}^{k} sc_E(c_{(i-1)+j}) = 0$. This implies that $c_i \notin R(E)$. As before, it follows that Next-k does not satisfy ε-Stability for $\varepsilon > 1$.
• q-NCSA satisfies Unanimity and Anti-Unanimity. First, we show that q-NCSA satisfies Unanimity: Let $c_i$ be a candidate with $sc_E(c_i) = n$. Then, $sc^{q\text{-NCSA}}_E(\{c_i\}) > 0 = sc^{q\text{-NCSA}}_E(\emptyset)$ and therefore $R(E) \neq \emptyset$. As $sc_E(c_i) = \max_{c \in C}(sc_E(c))$, this implies by Efficiency and Non-tiebreaking that $c_i \in R(E)$. Now, we show that q-NCSA satisfies Anti-Unanimity: Let $c_i$ be a candidate with $sc_E(c_i) = 0$. Then $2 sc_E(c_i) - n < 0$, which means for every set $S$ with $c_i \in S$ that the q-NCSA-score of $S$ is strictly smaller than the q-NCSA-score of $S \setminus \{c_i\}$. This means that the q-NCSA-score of $S$ is not maximal. As $S$ was chosen arbitrarily, we can conclude $c_i \notin R(E)$. As above, it follows that q-NCSA does not satisfy ε-Stability for $\varepsilon > 1$.
• Now, we claim that Size Priority always satisfies either Unanimity or Anti-Unanimity: First, we show that it satisfies Unanimity if $m \succ 0$: As selecting all $m$ candidates can never be tie-breaking, Size Priority will never select the empty set in this case. This implies $c_1 \in R(E)$ for all $E$. As Size Priority is non-tiebreaking, it follows that all $c_i$ with $sc_E(c_i) = n$ must be in the winning shortlist. On the other hand, Size Priority satisfies Anti-Unanimity if $0 \succ m$ holds by a symmetric argument.
Moreover, we claim that it satisfies both axioms (for non-degenerate profiles) if and only if it is a Decisive Size Priority rule. First let $R$ be a Decisive Size Priority rule. Then, for every non-degenerate profile there must be an $i$ such that $c_1, \dots, c_i$ can be selected as winners without tiebreaking. As $R$ is a Decisive Size Priority rule, we know $i \succ m$ and $i \succ 0$. It follows that $R(E)$ is neither $C$ nor $\emptyset$. By a similar argument as before, this implies that $R$ satisfies Unanimity and Anti-Unanimity.
Now assume that $R$ is a Size Priority instance that is not a Decisive Size Priority rule. Then there exists a $0 < k < m$ such that either $m \succ k$ or $0 \succ k$. Consider an election $E$ such that $sc_E(c_i) = n$ for all $i \leq k$ and $sc_E(c_j) = 0$ if $j > k$. Then, in the first case all candidates are winners and hence Anti-Unanimity is violated, and in the second case no candidate is a winner and hence Unanimity is violated.
It follows from the above that Increasing Size Priority satisfies Unanimity but not Anti-Unanimity. Finally, Size Priority, by definition, satisfies ε-Stability for $\varepsilon > 1$ if and only if 0 or $m$ is the most preferred size.
• Finally, Top-s-First-k-Gap satisfies Unanimity because First k-Gap and Increasing Size Priority do so. That means that for all elections both winner sets $W$ and $W'$ considered by Top-s-First-k-Gap satisfy Unanimity. Hence, whatever set is chosen, Unanimity is satisfied. On the other hand, it satisfies neither ε-Stability for $\varepsilon > 1$ nor Anti-Unanimity. Consider first the election given by $sc(E) = (3, 2, 0, 0)$ and Top-1-First-2-Gap. Then $\{c_1\}$ is the winner set and hence 2-Stability is violated. Now consider the same election under Top-3-First-3-Gap. Then both $W$ and $W'$ equal $\{c_1, \dots, c_4\}$ and hence Anti-Unanimity is violated.
We observe that First k-Gap is the only voting rule considered in this paper that satisfies ε-Stability for $\varepsilon > 1$. However, it is worth noting that Largest Gap satisfies ε-Stability whenever there is an ε-gap.

Minimal Voting Rules
The goal of shortlisting is to reduce a set of alternatives to a more manageable set of alternatives. It is therefore desirable that shortlisting rules produce short shortlists, without compromising on quality. To formalize this desideratum we define the concept of a minimal voting rule that satisfies a set of axioms.
Definition 3. Let $A$ be a set of axioms and let $S(A)$ be the set of all voting rules satisfying all axioms in $A$. Then, we say a voting rule $R$ is a minimal voting rule for $A$ if for all elections $E$ it holds that $R(E) = \bigcap_{R^* \in S(A)} R^*(E)$.
We observe that in general a minimal voting rule $R$ for a set of axioms $A$ does not satisfy all axioms in $A$. Consider, e.g., the following axiom:
Axiom 8 (Determined). Every election must have at least one winner, i.e., for all elections $E$ we have $R(E) \neq \emptyset$.
First, observe that besides f-Threshold, q-NCSA and Size Priority all voting rules considered in this paper are determined by definition. For f-Threshold it is clear that $R(E)$ can be empty if no candidate achieves enough approvals to clear the threshold. Observe that this is not the case for Max-Score-f-Threshold, as we assume $f(n) < n$ and hence candidates with maximal score are always winners. Size Priority returns the empty set if 0 is the most preferred set size that does not require tiebreaking. This cannot happen if Size Priority is either a Decisive Size Priority rule or $m \succ 0$; Size Priority is determined in these cases. In particular this means that Increasing Size Priority is determined. For q-NCSA we observe that if no candidate has at least 50% approvals then $2 sc_E(c) - n$ is negative for all candidates and hence the q-NCSA-score is only maximized by the empty set. Hence q-NCSA is not determined.
Now, let us consider arbitrary voting rules with a variable number of winners, i.e., not only shortlisting rules. Then for every $c \in C$ the rule $R_c$ that always outputs the set $\{c\}$ is a determined voting rule. It follows that the minimal determined voting rule always outputs the empty set and is hence not determined. In contrast, for shortlisting rules the following holds.
Proposition 4. Let $A$ be a set of axioms that contains the four basic shortlisting axioms (Axioms 1-4). Then the minimal voting rule for $A$ is again a shortlisting rule, i.e., it satisfies Axioms 1-4.
Proof. Let $A$ be such a set of axioms and let $R$ be the minimal voting rule for $A$. It is straightforward to see that $R$ satisfies Neutrality and Anonymity. We show that $R$ also satisfies Efficiency and is non-tiebreaking. Let $E$ be an election. As every rule in $S(A)$ is a shortlisting rule, there is a $k_{R^*} \in \{0, \dots, m\}$ for every rule $R^* \in S(A)$ such that $R^*(E) = \{c_1, \dots, c_{k_{R^*}}\}$. Now let $k_m$ be the smallest $k$ such that there is a rule $R^* \in S(A)$ with $R^*(E) = \{c_1, \dots, c_k\}$. Then, by definition, $R(E) = R^*(E)$. As $R^*(E)$ does not violate Efficiency and Non-tiebreaking for $E$, neither does $R$. As this argument holds for arbitrary elections, $R$ satisfies Efficiency and is non-tiebreaking.
As the voting rule that always outputs the empty set is a shortlisting rule, it is also the minimal shortlisting rule (without additional axioms). Therefore, we need to assume additional axioms. We consider determined and ε-stable shortlisting rules.
Theorem 5. Approval Voting is the minimal voting rule that is efficient, non-tiebreaking and determined. Furthermore, for every positive integer $k$, First k-Gap is the minimal voting rule that is efficient, k-stable and determined.
Proof. Let $A$ be the set {Efficiency, k-Stability, Determined} and $R$ be First k-Gap. We know that First k-Gap is efficient, k-stable and determined, therefore we know $\bigcap_{R^* \in S(A)} R^*(E) \subseteq R(E)$. Now, every determined voting rule must have a non-empty set of winners. If the voting rule is efficient, the set of winners must contain at least one top ranked alternative. Now, consider an enumeration of the alternatives $c_1, \dots, c_m$ such that $sc_E(c_j) \geq sc_E(c_{j+1})$ holds for all $j$. If a voting rule is k-stable, a winner set containing one top ranked alternative must contain all alternatives $c_i$ for which $sc_E(c_j) < sc_E(c_{j+1}) + k$ holds for all $j < i$. By the definition of First k-Gap this implies $R(E) \subseteq \bigcap_{R^* \in S(A)} R^*(E)$.
The minimality of Approval Voting is a special case of the minimality of First k-Gap, as 1-Stability equals Non-tiebreaking and First-1-Gap is equivalent to Approval Voting. This result is another strong indication that First k-Gap is promising from an axiomatic standpoint. It produces shortlists that are as short as possible without violating k-Stability, an axiom that is desirable in many shortlisting scenarios.
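The argument behind Theorem 5 can be illustrated with a short Python sketch. It compares First k-Gap with the set that Determined, Efficiency and k-Stability force into every winner set, and checks that First 1-Gap coincides with Approval Voting. All function names are hypothetical, scores are assumed to be given in non-increasing order, and winner sets are represented by their size.

```python
import random


def first_k_gap(scores: list[int], k: int) -> int:
    """Size of the First k-Gap winner set (scores sorted descending)."""
    for i in range(len(scores) - 1):
        if scores[i] - scores[i + 1] >= k:
            return i + 1
    return len(scores)


def forced_winners(scores: list[int], k: int) -> int:
    """Number of alternatives forced into any efficient, determined,
    k-stable winner set: start with the top alternative and keep adding
    the next one while it is less than k approvals behind."""
    size = 1
    while size < len(scores) and scores[size - 1] - scores[size] < k:
        size += 1
    return size


def approval_voting(scores: list[int]) -> int:
    """Number of alternatives with maximal approval score."""
    return scores.count(max(scores))


# Compare the three functions on random profiles (illustrative check only).
rng = random.Random(0)
profiles = [sorted((rng.randint(0, 20) for _ in range(6)), reverse=True)
            for _ in range(200)]
agree_forced = all(first_k_gap(p, k) == forced_winners(p, k)
                   for p in profiles for k in (1, 2, 5))
agree_av = all(first_k_gap(p, 1) == approval_voting(p) for p in profiles)
```

Both checks agreeing on these samples is of course no proof; it only mirrors the two inclusions used in the proof on concrete profiles.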
Next we will consider axioms that are not specific to shortlisting, but often appear in the voting and judgment aggregation literature to characterize "well behaved" aggregation techniques.

Independence
ε-Stability formalizes the idea that the length of a shortlist should take the magnitude of difference between approval scores into account. This contradicts an idea that is often considered in judgment aggregation, namely that all alternatives should be treated independently (Endriss, 2016).
Axiom 9 (Independence). If an alternative is approved by exactly the same voters in two elections then it must be a winner either in both or in neither. That is, for an alternative $c$ and two elections $E = (C, V)$ and $E^* = (C, V^*)$ with $|V| = |V^*|$ such that $c \in v_i$ if and only if $c \in v^*_i$ for all $i \leq n$, it holds that $c \in R(E)$ if and only if $c \in R(E^*)$.
f-Threshold rules are the only rules in our paper satisfying Independence. Indeed, Independence characterizes f-Threshold rules.
Theorem 6. Given a fixed set of alternatives $C$, every shortlisting rule that satisfies Independence is an f-Threshold rule for some function $f$.
Proof. Let $R$ be a voting rule that satisfies Anonymity and Independence. Then we claim that for two elections $E = (C, V)$ and $E^* = (C, V^*)$ with $|V| = |V^*|$ and $sc_E(c_i) = sc_{E^*}(c_i)$, we have $c_i \in R(E)$ if and only if $c_i \in R(E^*)$. To see this, let $E'$ be an election obtained from $E$ by permuting the voters such that $c_i$ is approved by the same voters in $E'$ and $E^*$. Then, by Anonymity, $c_i \in R(E)$ if and only if $c_i \in R(E')$. Now, as $c_i$ is approved by the same voters in $E'$ and $E^*$, Independence implies $c_i \in R(E')$ if and only if $c_i \in R(E^*)$.
Now, let $E = (C, V)$ and $E^* = (C, V^*)$ be two elections with $|V| = |V^*|$. Furthermore, assume $c_i \in R(E)$ and $sc_E(c_i) < sc_{E^*}(c_i)$. We claim that this implies $c_i \in R(E^*)$. By Independence, we can assume w.l.o.g. that there is an alternative $c_j$ such that $sc_E(c_j) = sc_{E^*}(c_i)$. Then, by Efficiency, $c_j \in R(E)$. Now, let $E'$ be the same election as $E$ but with $c_i$ and $c_j$ switched. Then by Neutrality we have $c_i \in R(E')$. As by definition $sc_{E'}(c_i) = sc_{E^*}(c_i)$, this implies $c_i \in R(E^*)$ by Anonymity and Independence.
The two arguments above mean that for every alternative $c_i$ and $n \in \mathbb{N}$ there is a $k$ such that for all elections $E = (C, V)$ with $|V| = n$ we know $c_i \in R(E)$ if and only if $sc_E(c_i) \geq k$. If $R$ also satisfies Neutrality, then $k$ must be the same for every $c_i \in C$ and hence $R$ must be an f-Threshold rule.
In light of Theorem 6, Independence seems to be a very strong requirement. Therefore, we also consider the axiom Independence of Losing Alternatives, which can be seen as a weakening of Independence. It states that removing a non-winning alternative cannot change the outcome of an election.

Axiom 10 (Independence of Losing Alternatives).
Clearly, f-Threshold satisfies this axiom as it also satisfies Independence. As removing a losing alternative does not change the maximal score, the same holds for Max-Score-f-Threshold. Furthermore, as the removal of a losing alternative can only widen the gap between the winners and the non-winners, First k-Gap satisfies Independence of Losing Alternatives, and so does Approval Voting, which is a special case of First k-Gap. Finally, for q-NCSA the removal of a losing alternative just removes some non-maximal sets from consideration. Clearly, this does not change which sets have maximal q-NCSA-score.
None of the other rules satisfy Independence of Losing Alternatives.
• First Majority: Assume $E$ is an election such that $sc(E) = (3, 2, 1, 0)$. Then the winner set under First Majority is $\{c_1, c_2\}$ but removing $c_3$ changes the winner set to $\{c_1\}$.
• Largest Gap: Consider the same election as for First Majority. Then, the winner set under Largest Gap is $\{c_1\}$ but removing $c_3$ changes this to $\{c_1, c_2\}$.
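The two counterexamples above can be replayed with a small Python sketch. The definitions used here are assumptions based on how the rules behave in these examples: First Majority selects the shortest prefix (in descending score order) whose total score exceeds the total score of the remaining alternatives, and Largest Gap cuts at the largest gap between consecutive scores, preferring the smaller winner set when several gaps are equally large. Function names are hypothetical and winner sets are represented by their size.

```python
def first_majority(scores: list[int]) -> int:
    """Size of the shortest prefix holding more than half of all approvals
    (assumed definition; scores sorted descending)."""
    total = sum(scores)
    running = 0
    for size, s in enumerate(scores, start=1):
        running += s
        if running > total - running:
            return size
    return len(scores)


def largest_gap(scores: list[int]) -> int:
    """Size of the prefix above the largest gap between consecutive scores.
    Ties between gaps go to the smaller winner set (assumption)."""
    gaps = [scores[i] - scores[i + 1] for i in range(len(scores) - 1)]
    if not gaps or max(gaps) == 0:
        return len(scores)  # degenerate profile: no gap to cut at
    return gaps.index(max(gaps)) + 1


full = [3, 2, 1, 0]      # sc(E) = (3, 2, 1, 0)
without_c3 = [3, 2, 0]   # the same election after removing c_3

fm_before, fm_after = first_majority(full), first_majority(without_c3)
lg_before, lg_after = largest_gap(full), largest_gap(without_c3)
```

Under these assumptions, First Majority shrinks from two winners to one and Largest Gap grows from one winner to two when the losing alternative $c_3$ is removed.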
For Size Priority we encounter a difficulty: Independence of Losing Alternatives cannot be applied to Size Priority because each instance of Size Priority is defined by a linear order on $0, \dots, m$ and decreasing the number of alternatives necessitates a different order. We can deal with this problem by defining classes of Size Priority instances:
Definition 4. Let $\succ$ be a linear order on $\mathbb{N}$. Then the class of Size Priority instances defined by $\succ$ contains for every number of alternatives $m$ the Size Priority instance given by the restriction of $\succ$ to $\{0, 1, \dots, m\}$.
We say that the class of Size Priority instances defined by $\succ$ is a class of Increasing Size Priority instances if every Size Priority instance in the class is an Increasing Size Priority instance. This definition allows us to ask whether classes of Size Priority instances (defined by $\succ$) satisfy Independence of Losing Alternatives. Consider, e.g., the class of Size Priority instances defined by any order of the form $2 \succ 1 \succ \dots$ and an election $E$ with $sc(E) = (2, 1, 1)$. Then $R(E) = \{c_1\}$ but the removal of $c_3$ leads to $R(E) = \{c_1, c_2\}$. Thus, Size Priority fails Independence of Losing Alternatives in general. However, we claim that every class of Increasing Size Priority instances satisfies Independence of Losing Alternatives. We distinguish two cases: First assume all $m$ alternatives are selected. Then Independence of Losing Alternatives is vacuously satisfied as there are no losing alternatives. On the other hand, assume that there is a $k < m$ such that $\{c_1, \dots, c_k\}$ is winning. As $R$ is an Increasing Size Priority instance, there is a $k' \leq k$ such that $\succ$ restricted to $\{1, \dots, m\}$ starts with $k' \succ k' + 1 \succ \dots \succ k$. As $k < m$, the same holds for $\succ$ restricted to $\{1, \dots, m - 1\}$. Hence, if we remove an alternative $c_{k^*}$ with $k^* > k$, the winner set does not change.
Finally, we claim that Top-s-First-k-Gap also satisfies Independence of Losing Alternatives. Assume first that the set of winners $W$ under First k-Gap is smaller than $s$. After removing a losing alternative, the set of winners under First k-Gap remains the same and is hence still smaller than $s$. It follows that the winner set under Top-s-First-k-Gap does not change. Now assume that $W$ is larger than $s$. Then, the winner set of Top-s-First-k-Gap has size at least $s$. Now, removing an alternative $c_j$ for $j > s$ cannot create a larger gap between the first $s$ alternatives. It follows that the winner set under First k-Gap after removing $c_j$ is still larger than $s$. This means by definition that the winner set of Top-s-First-k-Gap before and after removing $c_j$ was the winner set of Increasing Size Priority. As Increasing Size Priority satisfies Independence of Losing Alternatives, we can conclude that Top-s-First-k-Gap does so as well.
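The Size Priority example above can be reproduced with the following Python sketch. It assumes that Size Priority returns the most preferred winner-set size that does not require breaking a tie; the priority order is passed as a list (most preferred first) and is restricted to the sizes $0, \dots, m$ by skipping larger entries. The function names and the concrete orders are hypothetical.

```python
def non_tiebreaking_sizes(scores: list[int]) -> set[int]:
    """Sizes k for which the top-k set does not split tied alternatives
    (scores sorted descending). Sizes 0 and m never split a tie."""
    m = len(scores)
    return {k for k in range(m + 1)
            if k == 0 or k == m or scores[k - 1] > scores[k]}


def size_priority(scores: list[int], priority: list[int]) -> int:
    """Most preferred admissible size under the given priority order."""
    allowed = non_tiebreaking_sizes(scores)
    for k in priority:
        if k in allowed:  # sizes above m are never allowed, so they are skipped
            return k
    raise ValueError("priority order must contain every size 0..m")


two_first = [2, 1, 3, 0]    # an order of the form 2 > 1 > ...
increasing = [1, 2, 3, 0]   # an increasing order with size 0 placed last

before = size_priority([2, 1, 1], two_first)   # sc(E) = (2, 1, 1)
after = size_priority([2, 1], two_first)       # c_3 removed
inc_before = size_priority([2, 1, 1], increasing)
inc_after = size_priority([2, 1], increasing)
```

With the order starting with 2, removing the losing alternative $c_3$ makes size 2 admissible and the winner set grows, whereas the increasing order returns $\{c_1\}$ in both elections.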

Further Axioms
Finally, we consider three classic axioms of social choice theory, namely Resistance to Clones (Tideman, 1987) and two monotonicity axioms (Zwicker and Moulin, 2016) adapted to the shortlisting setting.
First we consider Resistance to Clones. In many shortlisting scenarios, for example in the context of recommender systems, it is not always clear if alternatives should be bundled together. For example, if we want to select a number of books to recommend, should we include each part of a trilogy separately or bundle the whole series? Shortlisting rules that satisfy Resistance to Clones are useful because the outcome of the rule is the same in both cases (if all parts of the series are equally popular).
and hence c i ∈ R(E).Then after adding one approval to all winning candidates, we have This implies c i ∈ R(E * ).On the other hand, assume sc E (c i ) ≤ α max(sc(E)) and hence It follows that c i ∈ R(E * ).Therefore, Max-Score-f -Threshold satisfies Set Monotonicity for constant f .
• Clearly, adding approvals for all winners can only increase the gap between winners and non-winners. Hence First k-Gap and Largest Gap satisfy Set Monotonicity. Approval Voting is a special case of First k-Gap and hence also satisfies Set Monotonicity.
• For f-Threshold clearly all winning candidates are still above the threshold in $E^*$ and all non-winning candidates remain below the threshold. Hence Set Monotonicity is satisfied.
• Size Priority: It is easy to see that a set $\{c_1, \dots, c_i\}$ is non-tiebreaking in $E$ if and only if it is non-tiebreaking in $E^*$. Hence, Size Priority satisfies Set Monotonicity.
• Next-k: Let {c 1 , . . ., c } be the winner set.First, let i < .By choice of we have sc It follows that {c 1 , . . ., c } is also the winner set under E * .
It holds that Now consider a set W = {c 1 , . . ., c i }.Consider first i < k.Then we have by the same argument as above Now, by the choice of k we have sc Again, by the choice of k we have (W ) and hence W is still the largest set with maximal q-NCSA-score.
• Finally, consider Top-s-First-k-Gap. Assume first that the set of winners $W$ under First k-Gap is smaller than $s$. As First k-Gap satisfies Set Monotonicity, $W$ remains the winner set in $E^*$. It is still smaller than $s$ and therefore still the winner set under Top-s-First-k-Gap. Now assume that $W$ is larger than $s$. Then, the winner set of Top-s-First-k-Gap has size at least $s$. Adding one approval to the first $s$ alternatives does not create a new k-gap between them. It follows that the winner set under First k-Gap is still larger than $s$. This means by definition that the winner set of Top-s-First-k-Gap in $E$ and $E^*$ is the winner set of Increasing Size Priority. As Increasing Size Priority satisfies Set Monotonicity, we can conclude that Top-s-First-k-Gap does so as well.
Set Monotonicity is a very natural axiom for many applications, so the fact that First Majority does not satisfy it makes it hard to recommend the rule in most situations. We can strengthen this axiom as follows: a voter that previously disapproved all winning alternatives changes her mind and now approves a superset of all (previously) winning alternatives; this should not change the set of winning alternatives. This is a useful property as it guarantees that if an additional voter enters the election, who agrees with the set of currently winning alternatives but might approve additional alternatives, then the set of winning alternatives remains the same and, in particular, does not expand.
In contrast to Set Monotonicity, only few rules satisfy Superset Monotonicity. Let us first show that First Majority, f-Threshold, Max-Score-f-Threshold, Next-k, Largest Gap and Size Priority do not satisfy Superset Monotonicity.
• Clearly, Superset Monotonicity implies Set Monotonicity, hence First Majority cannot satisfy Superset Monotonicity.
• First, consider an election $E$ with $n = 3$ such that $sc(E) = (2, 1)$. Then
• For Next-k, consider an election $E$ such that $sc(E) = (3, 1, 1)$. Then the winner set under Next-k is $\{c_1\}$. Now, if a voter changes her mind and additionally approves all three alternatives, then all three alternatives become winners under Next-k (for every $k > 1$).
• For 0.5-NCSA, consider an election $E$ with $sc(E) = (90, 90, 67)$ and $n = 98$. Here, the winner set is $\{c_1, c_2\}$ with $sc^{q\text{-NCSA}}_E(\{c_1, c_2\}) \approx 115.97$ and $sc^{q\text{-NCSA}}_E(\{c_1, c_2, c_3\}) \approx 115.47$. However, for $sc(E^*) = (91, 91, 68)$ (one voter who previously approved no one now approves every candidate), we obtain $sc^{q\text{-NCSA}}_{E^*}(\{c_1, c_2\}) \approx 118.79$ and $sc^{q\text{-NCSA}}_{E^*}(\{c_1, c_2, c_3\}) \approx 118.93$, so the winner set changes to $\{c_1, c_2, c_3\}$.
In contrast, Increasing Size Priority satisfies Superset Monotonicity as any ties between winners remain. Moreover, as the size of the gap between winners and non-winners cannot decrease and gaps within the winner set remain, First k-Gap satisfies Superset Monotonicity for all $k$ (which includes Approval Voting). For this reason Top-s-First-k-Gap also satisfies Superset Monotonicity by an argument analogous to the one for Set Monotonicity.
In general, the axioms discussed in this section can be seen as axioms about the stability of the winner set under specific changes to the election. We observed that First Majority and, to a lesser degree, Size Priority and Next-k did not perform well in this regard. On the other hand, it seems that the winner sets of First k-Gap and Approval Voting are particularly stable, as they are the only rules that satisfy all three axioms considered in this section.
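The Next-k and 0.5-NCSA counterexamples can be recomputed with a short Python sketch. Both definitions are assumptions that are consistent with the numbers in this section: the q-NCSA-score of a set $S$ is taken to be $\sum_{c \in S} (2 sc_E(c) - n) / |S|^q$ with the largest maximal set winning, and Next-k is taken to select the shortest prefix $c_1, \dots, c_i$ with $sc_E(c_i) > \sum_{j=1}^{k} sc_E(c_{i+j})$. Function names are hypothetical, scores are sorted in descending order, and winner sets are represented by their size.

```python
import math


def q_ncsa_score(scores: list[int], n: int, size: int, q: float) -> float:
    """Assumed q-NCSA-score of the top-`size` set: net approvals / size**q."""
    if size == 0:
        return 0.0
    return sum(2 * s - n for s in scores[:size]) / size ** q


def q_ncsa_winner_size(scores: list[int], n: int, q: float) -> int:
    """Largest prefix size whose q-NCSA-score is maximal (assumption)."""
    sizes = range(len(scores) + 1)
    best = max(q_ncsa_score(scores, n, k, q) for k in sizes)
    return max(k for k in sizes
               if math.isclose(q_ncsa_score(scores, n, k, q), best))


def next_k(scores: list[int], k: int) -> int:
    """Shortest prefix whose last member beats the next k scores combined."""
    for i in range(1, len(scores) + 1):
        if scores[i - 1] > sum(scores[i:i + k]):
            return i
    return len(scores)


# 0.5-NCSA: one voter who approved no one now approves every candidate.
ncsa_before = q_ncsa_winner_size([90, 90, 67], n=98, q=0.5)
ncsa_after = q_ncsa_winner_size([91, 91, 68], n=98, q=0.5)

# Next-2: one voter additionally approves all three alternatives.
next_before = next_k([3, 1, 1], k=2)
next_after = next_k([4, 2, 2], k=2)
```

Under these assumptions both rules enlarge the winner set after the change, which is exactly the behaviour that Superset Monotonicity rules out.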

Clustering Algorithms as Shortlisting Methods
Let us briefly discuss the relation between clustering algorithms and shortlisting methods. The goal of shortlisting is essentially to classify some alternatives as most suitable based on their approval score. The machine learning literature offers a wide variety of clustering algorithms that can perform such a classification.
In the following, we describe how any clustering algorithm can be translated into an approval-based variable multi-winner rule that satisfies Anonymity. For most clustering algorithms, the corresponding rule also satisfies Neutrality and Efficiency and is non-tiebreaking, and thus yields a shortlisting method. The procedure works as follows: Let $E = (C, V)$. We use $sc(E)$ as input for a clustering algorithm. This algorithm produces a partition $S_1, \dots, S_\beta$ of $sc(E)$. The winner set is given by the cluster that contains the highest score, i.e., the winner set consists of those candidates whose scores are contained in the selected cluster.
As this procedure is based on $sc(E)$, the resulting approval-based variable multi-winner rule is clearly anonymous. To show that the resulting rule is a shortlisting rule, we require the following two additional assumptions:
1. The clustering algorithm yields the same result for any permutation of $sc(E)$. If this is the case, the resulting rule is also neutral.
2. The algorithm outputs clusters that are non-intersecting intervals. If this is the case, the resulting rule is non-tiebreaking (since clusters do not intersect). It is also efficient, as the "winning" cluster is an interval containing the largest score.
These are indeed conditions that any reasonable clustering algorithm satisfies.
As an illustration, let us consider linkage-based algorithms (Shalev-Shwartz and Ben-David, 2014). Linkage-based algorithms work in rounds and start with the partition of $sc(E)$ into singletons. Then, in each round, two sets (clusters) are merged until a stopping criterion is satisfied. One important type of linkage-based algorithms are those where the two clusters with minimum distance are always merged. Thus, such algorithms are specified by two features: a distance metric for sets (to select the next sets to be merged) and a stopping criterion. We assume that if two or more pairs of sets have the same distance, then the pair containing the smallest element is merged. Following Shalev-Shwartz and Ben-David (2014), we consider three distance measures: the minimum distance between sets (Single Linkage), the average distance between sets (Average Linkage), and the maximum distance between sets (Max Linkage). These three methods can be combined with arbitrary stopping criteria; we consider two: (A) stopping as soon as only β clusters remain, and (B) stopping as soon as every pair of clusters has a distance of at least α.
Interestingly, two of our previously proposed methods correspond to linkage-based algorithms: First, if we combine the minimum distance with stopping criterion (A) for $\beta = 2$, we obtain the Largest Gap rule. Secondly, if we use the minimum distance and impose a distance upper-bound of $\alpha = k$ (stopping criterion B), we obtain the First k-Gap rule. Thirdly, if we seek winner sets of size roughly $m/k$ for some positive integer $k$, stopping criterion (A) with $\beta = k$ is a possible choice.
We see that the literature on clustering algorithms yields a large number of shortlisting methods. The inherent disadvantage of this approach is that clustering algorithms generally treat all clusters as equally important whereas for shortlisting methods the winning set of candidates is clearly most important. This difference becomes most pronounced when a clustering algorithm produces several clusters; only the "winning" cluster is relevant for the resulting shortlisting method. That being said, we identified two clustering algorithms that indeed correspond to sensible shortlisting methods (First k-Gap and Largest Gap), showing that this approach can be fruitful.
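The correspondence with Single Linkage can be made concrete with a minimal Python sketch. It is not the implementation used for the experiments; the function name and parameters are hypothetical. The sketch clusters one-dimensional approval scores, supports stopping criterion (A) via `beta` and stopping criterion (B) via `alpha`, and returns the cluster containing the highest score. It relies on the fact that for scores on a line the closest pair of clusters under the minimum distance is always adjacent in sorted order.

```python
from typing import Optional


def single_linkage_top_cluster(scores: list[int],
                               beta: Optional[int] = None,
                               alpha: Optional[int] = None) -> list[int]:
    """Single-linkage clustering of approval scores; returns the scores in
    the cluster that contains the maximum.

    beta:  stop as soon as only `beta` clusters remain (criterion A).
    alpha: stop as soon as all cluster distances are >= alpha (criterion B).
    """
    clusters = [[s] for s in sorted(scores)]  # singletons, ascending
    while len(clusters) > 1:
        if beta is not None and len(clusters) <= beta:
            break
        # Minimum distance between neighbouring clusters on the line.
        dists = [clusters[i + 1][0] - clusters[i][-1]
                 for i in range(len(clusters) - 1)]
        d = min(dists)
        if alpha is not None and d >= alpha:
            break
        i = dists.index(d)  # ties: merge the pair with the smallest element
        clusters[i:i + 2] = [clusters[i] + clusters[i + 1]]
    return clusters[-1]


# Criterion (A) with beta = 2 behaves like Largest Gap on these profiles.
top_a = single_linkage_top_cluster([3, 2, 1, 0], beta=2)
top_a_reduced = single_linkage_top_cluster([3, 2, 0], beta=2)
# Criterion (B) with alpha = 2 behaves like First 2-Gap on this profile.
top_b = single_linkage_top_cluster([5, 4, 2, 1], alpha=2)
```

Note how the tie-breaking convention (merge the pair containing the smallest element) makes the profile (3, 2, 1, 0) end with only the top-scoring alternative in the winning cluster.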

Experiments
In numerical experiments, we want to evaluate the characteristics of the considered shortlisting rules. The Python code used to run these experiments is available (Lackner and Maly, 2022). We use three data sets for our experiments: two synthetic data sets ("bias model" and "noise model") as well as data from a real-world shortlisting scenario, the nomination process for the Hugo awards.

Basic setup
Both synthetic data sets have the same basic setup. We assume a shortlisting scenario with 100 voters and 30 alternatives. Each alternative $c$ has an objective quality $q_c$, which is a real number in $[0, 1]$. For each alternative, we generate $q_c$ from a truncated normal (Gauss) distribution with mean 0.75 and standard deviation 0.2, restricted to values in $[0, 1]$. This is chosen to model difficult shortlisting scenarios with several strong candidates (with an objective quality $q_c$ close to 1). Our base assumption is that voters approve an alternative with likelihood $q_c$. Thus, the approval scores of alternatives are binomially distributed, specifically $sc_E(c) \sim B(100, q_c)$. We then modify this assumption to study two complications for shortlisting: imperfect quality estimates (noise) and biased voters.

The noise model
This model is controlled by a variable $\lambda \in [0, 1]$. We assume that voters do not perfectly perceive the quality of alternatives, but with increasing λ fail to differentiate between alternatives. Instead of our base assumption that each voter approves an alternative $c$ with likelihood $q_c$, we change this likelihood to $(1 - \lambda) q_c + 0.5 \lambda$. Thus, for $\lambda = 0$ this model coincides with our base assumption; for $\lambda = 1$ we have complete noise, i.e., all alternatives are approved with likelihood 0.5. As λ increases from 0 to 1, the amount of noise increases, or, in other words, the voters become less able to judge the quality of alternatives.
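The basic setup and the noise model can be sketched with the Python standard library as follows. This is an illustration under stated assumptions and not the authors' published code: the truncated normal is sampled by rejection, each voter's approval is an independent Bernoulli draw, and all function names and the seed are hypothetical.

```python
import random


def sample_quality(rng: random.Random, mean: float = 0.75,
                   sd: float = 0.2) -> float:
    """Draw q_c from a normal distribution truncated to [0, 1]
    (rejection sampling: redraw until the value lies in the interval)."""
    while True:
        q = rng.gauss(mean, sd)
        if 0.0 <= q <= 1.0:
            return q


def approval_probability(q_c: float, lam: float) -> float:
    """Noise model: likelihood that a voter approves an alternative."""
    return (1 - lam) * q_c + 0.5 * lam


def sample_noise_election(rng: random.Random, lam: float,
                          n_voters: int = 100, n_alternatives: int = 30):
    """Return (qualities, approval scores) for one synthetic election."""
    qualities = [sample_quality(rng) for _ in range(n_alternatives)]
    scores = [
        sum(rng.random() < approval_probability(q, lam)
            for _ in range(n_voters))
        for q in qualities
    ]
    return qualities, scores


rng = random.Random(42)  # fixed seed, only for reproducibility of the sketch
qualities, scores = sample_noise_election(rng, lam=0.3)
```

Summing independent Bernoulli draws per alternative gives a binomially distributed approval score, which for λ = 0 corresponds to the base assumption $sc_E(c) \sim B(100, q_c)$.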

The bias model
In this model we assume that a proportion of the voters are biased against (roughly) half of the alternatives; we call these alternatives disadvantaged. Biased voters approve these alternatives only with likelihood $0.5 \cdot q_c$, i.e., they perceive their quality as only half of their true quality. We assign each alternative with likelihood 0.5 to the set of disadvantaged alternatives. In addition, the alternative with the highest quality is always disadvantaged. We control the amount of bias via a variable $\gamma \in [0, 1]$: a subset of voters of size $100 \cdot \gamma$ is biased; for the remaining voters our base assumption applies. As in the noise model, as γ increases from 0 to 1 the shortlisting task becomes harder as the approval scores reflect the actual quality of alternatives less and less.
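A corresponding sketch of the bias model is given below. As before, this is an assumption-laden illustration rather than the code used for the experiments: the number of biased voters is obtained by rounding $100 \cdot \gamma$, approvals are independent Bernoulli draws, and the function name and example qualities are hypothetical.

```python
import random


def sample_bias_election(rng: random.Random, qualities: list[float],
                         gamma: float, n_voters: int = 100):
    """Return (disadvantaged alternatives, approval scores)."""
    m = len(qualities)
    # Each alternative is disadvantaged with likelihood 0.5 ...
    disadvantaged = {c for c in range(m) if rng.random() < 0.5}
    # ... and the alternative with the highest quality always is.
    disadvantaged.add(max(range(m), key=lambda c: qualities[c]))

    n_biased = round(n_voters * gamma)  # rounding is an assumption
    scores = []
    for c, q in enumerate(qualities):
        # Biased voters perceive only half the quality of disadvantaged
        # alternatives; all other approvals follow the base assumption.
        p_biased = 0.5 * q if c in disadvantaged else q
        approvals = sum(rng.random() < p_biased for _ in range(n_biased))
        approvals += sum(rng.random() < q for _ in range(n_voters - n_biased))
        scores.append(approvals)
    return disadvantaged, scores


rng = random.Random(7)  # fixed seed, only for reproducibility of the sketch
example_qualities = [1.0, 0.0, 0.6]
disadvantaged, biased_scores = sample_bias_election(rng, example_qualities,
                                                    gamma=1.0)
_, unbiased_scores = sample_bias_election(rng, example_qualities, gamma=0.0)
```

With γ = 0 no voter is biased, so an alternative of quality 1 is approved by everyone; with γ = 1 the best alternative is approved by roughly half of the voters only.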

The Hugo Awards Data Set
The Hugo Awards are annual awards for works in science fiction. Each year, awards are given in roughly 20 categories. The Hugo awards are particularly interesting for our paper as the nomination of candidates is based on voting and the submitted votes are made publicly available (this distinguishes the Hugo awards from many other literary awards with confidential nomination procedures).
The Hugo shortlisting (nomination) process works as follows. Each voter can nominate up to five candidates per category. This yields an approval-based election exactly as defined in Section 2. For each category, a shortlist of (usually) six candidates is selected. This shortlist, however, does not necessarily consist of the six candidates with the largest approval scores. Instead, a voting rule called "E Pluribus Hugo" is used. This is not a shortlisting rule in our sense (Definition 1), since it is not non-tiebreaking and fails Efficiency. However, "E Pluribus Hugo" generally selects candidates with high approval scores and hence the actual winners are always among the top-seven candidates with the largest approval scores. In Figure 2, we display in which position (when sorted by approval scores) the actual winner in the second stage is found. Note that there are three instances where a candidate in position 7 is winning. As "E Pluribus Hugo" always selects six candidates, this shows that either Non-tiebreaking or Efficiency is violated in these instances.
Our data set is based on the years 2018-2021, comprising a total of 78 shortlisting elections. The voting data for these years is publicly available on the Hugo website https://www.thehugoawards.org/. For each election we recorded the actual winner in the second stage (also based on voting, but with a different, larger set of voters). The data files are available alongside our code (Lackner and Maly, 2022).
In a sense this is an ideal data set to test our results, as the scenario exactly matches our formal model. However, there are two caveats to be noted. First, the true winner […]

⁷ […] 1/|v_i|, i.e., voters can contribute at most 1 to the total score of all candidates. Each round, the two candidates with the lowest fractional approval scores are selected. Out of these two, the one with the lower approval score is eliminated. This step is repeated with a reduced set of candidates (and updated fractional approval scores) until only six candidates remain. We omit the details of how ties are handled in this process and refer to Quinn and Schneier (2016), who introduced "E Pluribus Hugo" under the name SDV-LPE. This paper also contains a discussion of why this rule was chosen (in reaction to strategic voting in previous years) and its merits for this specific application.

The orthogonal nature of precision and average size can be seen clearly when comparing Approval Voting and 0.5n-Threshold: Approval Voting returns rather small winner sets (as seen in Figs. 3b and 4b), but if λ increases, the objectively best alternative is often not contained in the winner set (Figs. 3a and 4a). 0.5n-Threshold has large winner sets, but is likely to contain the objectively best alternative even for large λ (up to λ ≈ 0.8). If the average size of winner sets remains roughly constant (Increasing Size Priority, First Majority, Approval Voting), then the precision decreases with increasing noise/bias (λ).
Size Priority (with the considered priority order) is a noteworthy alternative to Approval Voting. Its average size is only slightly larger (roughly 4 versus 1 for Approval Voting), while its chance of including the objectively best alternative is significantly higher. As it is generally not necessary to have extremely small winner sets in shortlisting processes, we view Size Priority (with a sensibly chosen priority order) as superior to Approval Voting.
Considering the noise model (Fig. 3), we see a very interesting property of First 5-Gap: it is the only rule where the size of winner sets adjusts significantly to increasing noise. If λ increases, the differences between the approval scores vanish and thus fewer 5-gaps exist. As a consequence, the winner sets increase in size. This is a highly desirable behavior, as it allows First 5-Gap to maintain a high likelihood of containing the objectively best alternative without producing very large shortlists for low-noise instances. Two other rules also show this behavior: Top-10-First-5-Gap and First Majority, albeit both to only a small degree. Top-10-First-5-Gap achieves the same precision as First 5-Gap until λ reaches ≈ 0.5, after which its precision deteriorates. On the other hand, note that Top-10-First-5-Gap has a considerably smaller average size. For Largest Gap, 0.5n-Threshold, and q-NCSA, we see the opposite effect: winner sets are large for low noise but decrease with increasing λ. This is not a sensible behavior; note that First Majority achieves better precision with a much smaller average size.
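The adaptive behavior of First k-Gap can be illustrated with a small sketch. The formal definition is given in Section 3; here we assume that the rule cuts the ranking by approval score at the first position where two consecutive scores differ by at least k, and returns all alternatives if no such gap exists. The function name and the example scores are hypothetical.

```python
def first_k_gap(scores, k):
    """Return the alternatives ranked above the first gap of at least k.

    scores: dict mapping each alternative to its approval score.
    Assumption: a "k-gap" is a difference of at least k between
    consecutive approval scores in the descending ranking.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    for i in range(len(ranked) - 1):
        if scores[ranked[i]] - scores[ranked[i + 1]] >= k:
            return set(ranked[: i + 1])
    # No k-gap exists: every alternative is shortlisted.
    return set(ranked)


# Clear separation between strong and weak alternatives: short shortlist.
low_noise = {"a": 90, "b": 88, "c": 40, "d": 38}
# Scores close together: no 5-gap, so the shortlist grows.
high_noise = {"a": 52, "b": 50, "c": 49, "d": 47}

print(first_k_gap(low_noise, 5))
print(first_k_gap(high_noise, 5))
```

In the first profile the gap between b and c triggers the cut; in the second, all differences are below 5 and the whole set is returned, mirroring the growth of winner sets under noise.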
For the bias model, we do not observe any shortlisting rule that reacts to an increase in bias with a larger average size.
To sum up, our experiments show the behavior of shortlisting rules with accurate and inaccurate voters, and the trade-off between large and small winner set sizes. In these experiments, we see two shortlisting rules with particularly favorable characteristics:
1. Size Priority produces small winner sets with good precision. Thus, it shows a certain robustness to a noisy selection process, as is desirable in shortlisting settings.
2. First k-Gap manages to adapt in high-noise settings by increasing the winner set size, the only rule with this distinct feature. This makes it particularly recommendable in settings with unclear outcomes (few or many best alternatives), where a flexible shortlisting method is required. As we will see in the next experiment, however, First k-Gap on its own can be insufficient, which leads us to recommending the related Top-s-First-k-Gap rule instead.

Experiment 2: Tradeoffs between Precision and Size
In this second experiment, we want to study the tradeoff between precision and size in more depth and for many more shortlisting rules. Here, we put particular emphasis on the Hugo data set (but also consider both synthetic sets). To this end, we represent shortlisting rules as points in a two-dimensional plane with average size on the x-axis and precision on the y-axis. Figure 5 shows these results for the Hugo data set (points are averaged over 78 instances), Figure 6 shows these results for the noise data set (no noise to moderate noise, i.e., λ ∈ [0, 0.5], yielding 10,000 instances), and Figure 7 for the bias model (also for λ ∈ [0, 0.5], 10,000 instances). These plots can be understood as follows. Ideal shortlisting rules lie in the top left corner (high precision, low average size). As this is generally unachievable, we have to choose a compromise between the two metrics. The gray area shows the space in which such a compromise has to be found (when choosing from shortlisting rules that are studied in this paper).
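As a rough illustration of how a rule is mapped to a point in this plane, the following sketch computes both coordinates. It assumes that precision is the fraction of instances in which the shortlist contains the actual winner (or objectively best alternative); the helper names and the toy data are hypothetical.

```python
def evaluate_rule(rule, instances):
    """Return (average size, precision) of a shortlisting rule.

    rule: function mapping a dict of approval scores to a set of alternatives.
    instances: list of (scores, best) pairs, where `best` is the actual
    winner or objectively best alternative of that instance.
    """
    hits = 0
    total_size = 0
    for scores, best in instances:
        shortlist = rule(scores)
        hits += best in shortlist
        total_size += len(shortlist)
    return total_size / len(instances), hits / len(instances)


def top_two(scores):
    """Toy rule for illustration only: the two highest-scoring alternatives."""
    return set(sorted(scores, key=scores.get, reverse=True)[:2])


toy_instances = [
    ({"a": 5, "b": 3, "c": 1}, "a"),  # "a" is shortlisted
    ({"a": 2, "b": 4, "c": 3}, "a"),  # "a" is missed
]
avg_size, precision = evaluate_rule(top_two, toy_instances)
```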
We will now explain the gray area in more detail. For Experiment 2, we consider all shortlisting rules defined in Section 3 with the following parameters. For α ∈ {0, 0.01, 0.02, ..., 1}, we consider
• Increasing Size Priority with priority orders of the form s ▷ s + 1 ▷ ... (ISP-s) for 2 ≤ s ≤ m,

Each shortlisting rule yields a point in this two-dimensional space. Shortlisting rules with one parameter are displayed as lines. We can compute a Pareto frontier consisting of all points that do not have another point above and to the left of them. The boundary of the gray area shows this Pareto frontier. Consequently, voting rules close to this frontier represent a more beneficial tradeoff between precision and average size.
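The Pareto frontier computation can be sketched as follows. We read "above and to the left" as weak dominance: another point has at most the same average size and at least the same precision, and is strictly better in one of the two. This reading, the function name, and the example values are assumptions made for illustration; the values are not taken from our experiments.

```python
def pareto_frontier(points):
    """Return the names of all non-dominated rules, sorted by average size.

    points: dict mapping a rule name to (average_size, precision).
    """
    frontier = []
    for name, (size, prec) in points.items():
        dominated = any(
            o_size <= size and o_prec >= prec and (o_size < size or o_prec > prec)
            for other, (o_size, o_prec) in points.items()
            if other != name
        )
        if not dominated:
            frontier.append(name)
    return sorted(frontier, key=lambda r: points[r][0])


# Hypothetical points (average size, precision) for four unnamed rules.
example_points = {
    "A": (1.0, 0.5),
    "B": (4.0, 0.8),
    "C": (5.0, 0.7),  # dominated by B: larger and less precise
    "D": (7.0, 1.0),
}
print(pareto_frontier(example_points))
```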

Results for the Hugo data set
When looking at Figure 5, we see as expected that ISP-7 achieves a precision of 1 and an average size slightly above 7 (due to ties). We furthermore see that ISP-4, ISP-5, and ISP-6 are all very close to the Pareto frontier. This raises the question whether Increasing Size Priority is an ideal choice for this data set. While this class is a good choice, it can be improved by Top-s-First-k-Gap. In Table 2, we show, by way of example, the precision and average size values for ISP-6 and ISP-7 alongside shortlisting rules that achieve a smaller average size with the same (or better) precision. This table gives an indication of how to use Top-s-First-k-Gap in a real-world shortlisting task: First, choose a sensible maximum size of a shortlist; in the case of the Hugo Awards this was chosen to be six (and was five prior to 2017). Then, identify a bound that constitutes a significant gap; this bound can be chosen conservatively. In the Hugo data set, a sensible choice appears to be 30% of voters. That is, if we encounter a gap (in the sense of First k-Gap) in sc(E) of more than 0.3n, we cut the shortlist at this point if this leads to a shorter shortlist.
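The two-step recipe above can be sketched as follows. This is an illustration under stated assumptions rather than the formal definition from Section 3: the top-s set includes all alternatives tied with the s-th highest score, a gap counts if it is strictly larger than the bound (following the wording "more than 0.3n"), and the shorter of the two candidate shortlists is returned. The function name and the score profiles are hypothetical.

```python
def top_s_first_k_gap(scores, s, gap):
    """Shorter of: top-s (with ties) and the cut at the first large gap.

    scores: dict mapping each alternative to its approval score.
    s: maximum intended shortlist size (s >= 1), exceeded only by ties.
    gap: a score difference strictly larger than this triggers a cut.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    if not ranked:
        return set()

    # Top-s alternatives, extended by everything tied with the s-th score.
    cutoff = scores[ranked[min(s, len(ranked)) - 1]]
    top_s = [c for c in ranked if scores[c] >= cutoff]

    # Prefix of the ranking up to the first sufficiently large gap.
    gap_cut = list(ranked)
    for i in range(len(ranked) - 1):
        if scores[ranked[i]] - scores[ranked[i + 1]] > gap:
            gap_cut = ranked[: i + 1]
            break

    return set(min(top_s, gap_cut, key=len))


n = 100  # hypothetical number of voters
clear = {"a": 80, "b": 75, "c": 40, "d": 38, "e": 36, "f": 35, "g": 35, "h": 10}
flat = {"a": 50, "b": 48, "c": 46, "d": 44, "e": 42, "f": 40, "g": 40, "h": 38}

print(top_s_first_k_gap(clear, s=6, gap=0.3 * n))  # cut at the gap after b
print(top_s_first_k_gap(flat, s=6, gap=0.3 * n))   # falls back to top-6 with ties
```

With a conservatively large bound, the gap cut rarely applies and the rule behaves like the plain top-s selection.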
Let us now consider other shortlisting rules. We see that Max-Score-f-Threshold closely traces the Pareto frontier and thus is a very good choice for selecting a compromise between precision and average size. f-Threshold and First k-Gap are less convincing. q-NCSA performs even worse, as candidates very often have approval scores of less than 0.5n. Therefore, q-NCSA selects mostly empty sets and is thus not visible in Figure 5 (cf. Observation 2). A notable unparameterized rule is First Majority, which is very close to the Pareto frontier.
To sum up our results for the Hugo data set, we identify the following shortlisting rules as particularly suitable. Top-7-First-(α · n)-Gap for α ∈ [0.27, 0.30] and Top-7-First-(0.69 · max sc(E))-Gap achieve a precision of 1 with the smallest average size (7.051); in Figure 5 these rules correspond to the point labeled "optimal rules". In general, Increasing Size Priority and Max-Score-f-Threshold achieve very good compromises between precision and average size.

Results for the noise and bias models
Figure 6 shows the results for the noise model. Here, too, Increasing Size Priority and Max-Score-f-Threshold are very close to the Pareto frontier. The same holds for First Majority. A major difference from the Hugo data set is the performance of q-NCSA. As candidates generally have approval scores of more than 0.5n, q-NCSA works as intended with points close to the Pareto frontier. As before, f-Threshold and First k-Gap are less convincing.
The bias model is a scenario where some high-quality candidates receive too few approvals. In Figure 7, we see that this is a tough problem. The only recommendable shortlisting rules are Increasing Size Priority rules. By simply shortlisting the top-k candidates, there is a certain chance to also shortlist high-quality but disadvantaged candidates. We remark that the Pareto frontier between ISP-points is due to Top-s-First-k-Gap rules.

Discussion
Based on our analysis, we recommend three shortlisting methods: Size Priority, Top-s-First-k-Gap, and f-Threshold. Let us discuss their advantages and disadvantages:
• Size Priority, in particular Increasing Size Priority, is recommendable if the size of the winner set is of particular importance, e.g., in highly structured shortlisting processes such as the nomination for awards. Increasing Size Priority exhibits good axiomatic properties (cf. Table 1) as well as a very solid behavior in our numerical experiments. In particular for the bias data set, where an (unknown) subset of candidates is discriminated against, Increasing Size Priority appears to be the best choice. By selecting k candidates with the highest approval scores (or more in case of ties), the differences in approval scores within the selected group are ignored and thus disadvantaged, high-quality candidates have a better chance to be chosen. On the other hand, Increasing Size Priority makes limited use of the available approval preferences and thus can be seen as a good choice mostly in settings with limited trust in voters' accuracy. When voters are expected to have good estimates of the candidates' qualities, the following two shortlisting rules are better suited.
• Our axiomatic analysis reveals First k-Gap as a particularly strong rule in that it is the minimal rule satisfying -Stability. Furthermore, it is the only rule that adapts to increasing noise in our simulations. However, we have seen in Experiment 2 (Section 6.5) that First k-Gap is prone to choosing winner sets that are larger than necessary. Thus, we recommend using Top-s-First-k-Gap instead. Top-s-First-k-Gap shares most axiomatic properties with First k-Gap (cf. Table 1) except -Stability and Resistance to Clones. Another advantage of Top-s-First-k-Gap is that the parameter k is difficult to choose for First k-Gap, whereas it is very reasonable to conservatively pick a large k-value for Top-s-First-k-Gap. Choosing k too large simply diminishes the differences between Top-s-First-k-Gap and ISP-s.
• Finally, Theorem 6 shows that f-Threshold rules are the only rules satisfying the Independence axiom. Therefore, if the selection of alternatives should be independent of each other, then clearly an f-Threshold rule should be chosen. For example, the inclusion in the Baseball Hall of Fame should depend on the quality of a player and not on the quality of the other candidates. In our experiments, we have seen that the related class of Max-Score-f-Threshold rules has advantages over f-Threshold rules. The difference between these two classes, however, is only relevant if the maximum score of candidates differs between elections for reasons unrelated to the candidates' quality. This was the case, e.g., in the Hugo data set, where the relative maximum approval score varied significantly between award categories.
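The difference between the two threshold classes discussed in the last item can be made concrete with a small sketch. We assume, for illustration only, a linear threshold function: f-Threshold selects every alternative approved by at least an α-fraction of the n voters, while Max-Score-f-Threshold selects every alternative whose score is at least an α-fraction of the maximum approval score. The function names, the non-strict comparison, and the two score profiles are hypothetical.

```python
def f_threshold(scores, num_voters, alpha):
    """Select alternatives approved by at least alpha * num_voters voters."""
    return {c for c, sc in scores.items() if sc >= alpha * num_voters}


def max_score_f_threshold(scores, alpha):
    """Select alternatives with at least alpha times the maximum score."""
    if not scores:
        return set()
    top = max(scores.values())
    return {c for c, sc in scores.items() if sc >= alpha * top}


# Two hypothetical categories with 100 voters each. The relative structure
# is the same, but voters in the second category approve far less often.
popular = {"a": 80, "b": 70, "c": 30}
sparse = {"x": 40, "y": 35, "z": 15}

print(f_threshold(popular, 100, 0.5), f_threshold(sparse, 100, 0.5))
print(max_score_f_threshold(popular, 0.5), max_score_f_threshold(sparse, 0.5))
```

In the sparse category the absolute threshold returns an empty shortlist, whereas the threshold relative to the maximum score still selects the two leading alternatives. The absolute threshold, in turn, decides on each alternative without looking at the scores of the others.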
These recommendations are applicable to most shortlisting scenarios. There are, however, possible variations of our shortlisting framework that require further analysis in the future. For example, while strategyproofness is usually not important in elections with independent experts, there are some shortlisting applications with a more open electorate where this may become an issue (Quinn and Schneier, 2016; Bredereck et al., 2017). We have not considered strategic voting in this paper and assume that this viewpoint will give rise to different recommendations. Moreover, it may be worth investigating whether using ordinal preferences (rankings) instead of approval ballots can increase the quality of the shortlisting process (shortlisting rules for ordinal preferences can be found, e.g., in the works of Elkind et al. (2017a); Aziz et al. (2017b); Faliszewski et al. (2017); Elkind et al. (2017b)). In general, the class of variable multi-winner rules (and social dichotomy functions) deserves further attention as many fundamental questions (concerning proportionality, axiomatic classifications, algorithms, etc.) are still unexplored.
We observe that both sides of the equation equal the change of the function $x^q$ over an interval of length one. Because the derivative of $f(x) = x^q$ for $0 \le q \le 1$ is monotonically decreasing, we can bound this change using the slope of $f(x)$ at either the start or the end point of the interval. Therefore, we have $i^q - (i-1)^q \ge f'(i) \ge (i+1)^q - i^q$. It follows that $\frac{\sum_{k=1}^{i+1} (2\,sc_E(c_k) - n)}{i^q + z} \le \frac{\sum_{k=1}^{i+1} (2\,sc_E(c_k) - n)}{i^q + (i+1)^q - i^q} = sc^{q\text{-NCSA}}_E(\{c_1, \dots, c_{i+1}\})$.
This concludes the proof.

Figure 1: Illustration of Example 1

[…]) and thus Next-1 selects {c_1, ..., c_{i−1}}. The argument for ISP-1 is similar. Finally, consider Top-s-First-k-Gap. If s = m, then W = C. Consequently, |W| ≤ |W| and W is thus the winner set. On the other hand, if k = m then W = C. Hence, |W| ≤ |W| and W is thus the winner set.
Unanimity | Anti-Unanimity | -Stability | Determined | Independence | Ind. of Losing Alt. | Res. to Clones | Set Monot. | Superset Monot.

Axiom 11 (Resistance to Clones). Adding a clone of an alternative to an election does not change the outcome, i.e., if E = (C, V) and E* = (C ∪ {c*}, V*) are two elections with |V| = |V*| such that, for all j ≤ n, we have c_i ∈ v_j if and only if

Figure 2: Shortlist positions of the actual winners when sorted by approval scores.

Figure 3: Numerical simulations for the noise model. (a) Inclusion of objectively best alternative. (b) Average size of winner sets.

Figure 4: Numerical simulations for the bias model. (a) Inclusion of objectively best alternative. (b) Average size of winner sets.