1 Introduction

The single transferable vote (STV) election procedure has been used for multiwinner elections in many countries since the early to mid-20th century. Members of the Australian Senate have been elected using STV since 1948, and members of the Dáil Éireann, the lower legislative house of the Irish legislature, have been elected using STV since 1921. In the 21st century the method has experienced a surge in interest and usage. Many municipalities in the United States currently use the single-winner version of STV, often referred to as instant runoff voting (IRV), for local elections. Such elections include city council races in Minneapolis, MN, Oakland, CA, and San Francisco, CA, as well as primary races for city office in New York City. IRV was even used for the 2020 US Presidential election in the state of Maine. In Scotland, STV has been used for multiwinner local government elections in council areas since 2007, and IRV has been used for a handful of single-winner elections.

While STV has advantages as a voting method, such as its ability to achieve proportional representation in multiwinner elections, the method also has drawbacks. One of its most serious weaknesses is that STV is non-monotonic: a candidate might be worse off receiving more support from voters (an upward monotonicity anomaly), or better off receiving less support from voters (a downward monotonicity anomaly). That is, the following scenario is possible when using STV: a candidate X wins a seat, but if X were ranked higher on some ballots then X would not win a seat. Similarly, it is possible that X does not win a seat but would win a seat if moved down the rankings on some ballots. Other types of non-monotonicity are also possible. For example, it is possible that X does not win a seat in an election but if fewer seats were available then X would win a seat (a committee size monotonicity anomaly). Also, it is possible that a losing candidate X would have won a seat if some of X’s supporters had abstained from voting in the election (a no-show anomaly).

The purpose of this article is to investigate how often such anomalies occur in real-world elections. To that end, we collected and analyzed the freely available vote data from 1079 Scottish local government elections, 30 single-winner and 1049 multiwinner. All elections used STV (or IRV) to elect a set of winners. For each type of monotonicity anomaly mentioned above, we wrote Python code that searched the ballot data from each Scottish election to determine how many elections demonstrated the anomaly. Our general finding is that monotonicity anomalies are rare in these elections, with each type occurring in on the order of 1–4% of elections. As far as we are aware this paper is the largest empirical study of monotonicity to date, as the prior (mathematically oriented) social choice literature has not analyzed this database of Scottish STV elections.

2 Previous literature on the frequency of monotonicity anomalies

Previous literature regarding the frequency with which STV can produce monotonicity anomalies mostly addresses only the single-winner upward case, and very little of this literature is empirical. One empirical analysis (Graham-Squire and Zayatz 2021) considered IRV elections in San Francisco and Alameda County, California between 2008 and 2016, as well as the 2009 mayoral election in Burlington, Vermont. The study found an upward monotonicity anomaly rate of 0.74% (1/135) of all IRV elections, 2.70% (1/37) of IRV elections that went to at least a second round, and 7.7% (1/13) of competitive three-candidate IRV elections. The most comprehensive empirical analysis of US IRV elections that went to a second round (Graham-Squire and McCune 2023) found anomaly rates of 2.2% (upward), 1.6% (downward) and 0.5% (no-show). Additional empirical work tends to focus on a single election of interest, which does not provide insight on anomaly rates (Graham-Squire and McCune 2024; McCune and McCune 2023; Ornstein and Norman 2014).

Semi-empirical research (i.e., research that does not have access to complete ballot preference data) finds small percentages of elections demonstrating anomalies when considering all elections, with estimates of zero (Bradley 1995), 0.028% (Allard 1995), 1.4% (Miller 2017), and 1.5% (Gallagher 2013). For extremely close elections, Miller (2017) found that 33% of elections demonstrate monotonicity failure, and this percentage increases as elections become more competitive. Both Allard (1995) and Bradley (1995) address multiwinner STV elections, but Allard (1995) uses poll data in the absence of complete preference data and considers only very restricted kinds of monotonicity anomalies, and the methodology in Bradley (1995) is not clear. In a semi-empirical analysis, Keskin et al. (2022) found that 20% of past French presidential elections likely demonstrated a monotonicity failure under the voting method of plurality runoff, which is similar to IRV.

Theoretical research into three-candidate IRV elections tends to find a higher frequency of upward anomalies, although the prevalence varies depending on the assumptions of the model and the closeness of the election. Lepelley et al. (1996) estimate that 1.76–4.51% of all elections would demonstrate upward anomalies, where the percentage depends on which model of voter behavior is used. Quas (2004) found a frequency between 4.5 and 6.9%, whereas Plassmann and Tideman (2014) found a frequency of less than 1%. Using a different model of voter behavior and a broader definition of monotonicity, Quas (2004) found that the percentage of elections demonstrating anomalies tends to 100% as the number of candidates increases. In elections where the top three candidates all receive more than 25% of the first-place vote, estimates range from as low as 10% (Miller 2017) to 51% in highly competitive elections where the top three candidates are in a virtual tie (Ornstein and Norman 2014).

Some theoretical research has also examined the prevalence of downward and no-show anomalies in three-candidate IRV elections. For downward anomalies, estimates for a lower bound range from 1.97% (Lepelley and Merlin 2001) to 3.8% (Miller 2017). For no-show anomalies, Plassmann and Tideman (2014) found rates of 0.38–0.47%, and Lepelley and Merlin (2001) found rates about 10 times higher, between 4.1 and 5.6%. The former used a spatial model, and the latter utilized the impartial anonymous culture and impartial culture models. In empirical research, Graham-Squire and Zayatz (2021) found a rate of 0% for no-show anomalies in the 135 IRV elections analyzed. There has been no prior theoretical analysis of the frequency of committee size anomalies.

As far as we are aware, there are no documented monotonicity anomalies of any kind in real-world multiwinner elections where full preference data is available and a set of ballots can be found which demonstrate the given anomaly. The reason for the lack of examples is that the database of Scottish elections is the first large set of multiwinner elections with available preference data which has been searched for monotonicity anomalies. All prior documented instances of monotonicity anomalies have occurred in single-winner IRV political elections in the United States, which are listed below.

  • The 2009 mayoral election in Burlington, VT, which demonstrated an upward anomaly (Miller 2017; Ornstein and Norman 2014).

  • The 2020 board of supervisors election in the seventh ward of San Francisco, CA, which demonstrated a downward anomaly (Graham-Squire and McCune 2023).

  • The 2021 city council election in the second ward of Minneapolis, MN, which demonstrated upward and downward anomalies (McCune and McCune 2023).

  • The August 2022 Special Election for the US House of Representatives in Alaska, which demonstrated upward and no-show anomalies (Graham-Squire and McCune 2024).

  • The 2022 school director election in district 4 of Oakland, CA, which demonstrated upward and downward anomalies (McCune 2023).

Our results (Table 10) significantly increase the number of known monotonicity anomalies in real-world elections, and represent the first such documented anomalies in multiwinner elections.

3 Preliminaries: single transferable vote and monotonicity anomalies

The Scottish elections we study use STV to choose the set of election winners. There are different voting methods which can be classified as STV; we use the term “STV” to refer only to the Scottish STV rules, which we outline below.

Let n denote the number of candidates in an election and let S denote the size of the winner set, which equals the number of available legislative seats. In an STV election, each voter casts a preference ballot where the voter provides a preference ranking of the candidates. In Scottish elections voters are not required to provide a complete ranking and thus it is common for voters to rank only a subset of the candidates, leaving some candidates off their ballots. The ballots are combined into a preference profile, which records how many ballots of each kind were cast; the preference profile of each election is the data we collected and analyzed. Table 1 shows an example of a preference profile in an election with 501 voters and \(n=4\) candidates A, B, C, and D. The table shows that 19 voters rank A first, B second, and leave C and D off the ballot; the other numbers across the top row convey similar information about the number of voters who cast the corresponding ballot. When discussing a given ballot we use the notation \(\succ \) to denote that a candidate is ranked immediately above another candidate, so that 41 people cast the ballot \(A \succ B \succ C \succ D\), for example. An election is an ordered pair (P, S) where P is a preference profile. STV takes an election as input and outputs a winner set, which we denote W(P, S).
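As an illustration of this data structure, a preference profile can be held as a mapping from ranked ballots to counts. The following is a minimal sketch, not the code used in the study: the counts 19 and 41 are the two stated in the text, while the bullet vote for C and the helper name `first_place_totals` are hypothetical and included only to show a partial ballot.

```python
from collections import Counter

# A ballot is a tuple of candidates in ranked order; a partial ballot is
# simply a shorter tuple. The counts 19 and 41 come from the text; the
# bullet vote for C is a hypothetical entry for illustration only.
profile = {
    ("A", "B"): 19,
    ("A", "B", "C", "D"): 41,
    ("C",): 5,  # hypothetical
}

def first_place_totals(profile):
    """Return the number of first-place votes for each candidate."""
    totals = Counter()
    for ballot, count in profile.items():
        if ballot:  # ignore empty ballots
            totals[ballot[0]] += count
    return totals

totals = first_place_totals(profile)
```

Storing counts per distinct ballot, rather than one record per voter, mirrors the tabular form of a preference profile and keeps the profile small when many voters cast the same ranking.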

Table 1 An example of a preference profile with 501 voters

It is difficult to provide a complete definition of STV in a concise fashion. Therefore, we provide a high level description which we illustrate using examples with the preference profile in Table 1. The formal description of the rules can be found at https://www.legislation.gov.uk/sdsi/2007/0110714245.

The method of STV proceeds in rounds. In each round, either a candidate earns enough votes to be elected or no candidate is elected and the candidate with the fewest (first-place) votes is eliminated. The number of votes required to be elected is called the quota, and is calculated by

$$ \text {quota } = \left\lfloor \frac{\text {Number of Voters}}{S+1}\right\rfloor +1. $$

If no candidate reaches quota in a given round then the candidate with the fewest first-place votes is eliminated, and this candidate’s votes are transferred to the next candidate on their ballots who has not been elected or eliminated. If a candidate reaches quota, that candidate is elected and the votes they receive above quota (surplus votes) are transferred in a fashion similar to that of an eliminated candidate, except the surplus votes are transferred in proportion to the number of ballots on which each other candidate appears. To explain how these transfers work, suppose candidate A is elected with a total of a votes and a surplus of \(A_s\) votes (so that \(A_s = a - \) quota), and candidate B is the next eligible candidate on b of these ballots. Rather than receive b votes from the election of A, candidate B receives \((A_s/a)b\) votes, resulting in a fractional vote transfer. The method continues in this fashion until S candidates are elected, or until some number \(S'<S\) of candidates have been elected by surpassing quota and there are only \(S-S'\) candidates remaining who have not been elected or eliminated.
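The quota and the fractional surplus transfer can be checked with a few lines of Python. This is a minimal sketch under the description above; the function names are our own, and exact fractions are used so that no rounding convention is assumed. The figures in the usage lines (501 voters, a winner with 192 votes, 60 and 75 ballots naming the next candidate) are those of the worked example that follows.

```python
from fractions import Fraction

def quota(num_voters, seats):
    """Quota = floor(num_voters / (seats + 1)) + 1."""
    return num_voters // (seats + 1) + 1

def surplus_transfer(total_votes, quota_value, ballots_for_next):
    """Votes passed to the next candidate: (A_s / a) * b,
    where A_s = a - quota is the surplus of the elected candidate."""
    surplus = total_votes - quota_value
    return Fraction(surplus, total_votes) * ballots_for_next

q1 = quota(501, 1)                    # single-winner quota
q2 = quota(501, 2)                    # two-seat quota
to_b = surplus_transfer(192, q2, 60)  # 60 ballots list B next
to_d = surplus_transfer(192, q2, 75)  # 75 ballots list D next
```

Here `float(to_b)` is 7.5 and `float(to_d)` is 9.375, matching the transfers described in the two-seat example.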

We illustrate this description using the preference profile in Table 1 and seat values of \(S=1\) and \(S=2\).

Example 1

When \(S=1\) the quota is \(\lfloor 501/2 \rfloor +1 = 251\) and a candidate must receive a majority of votes to win. No candidate initially receives a majority of first-place votes and thus C, the candidate with the fewest first-place votes, is eliminated. As a result 57 votes are transferred to A, 12 to B, and 40 to D, as displayed in the vote totals for the next round of votes in the \(S=1\) part of Table 2. None of the remaining candidates have reached quota and thus D, who now has 154 votes, is eliminated, causing 56 votes to transfer to A and 98 votes to transfer to B. The STV method declares B the winner, as they have now surpassed quota. Thus, \(W(P,1)=\{B\}\).

A transfer of surplus votes never occurs when \(S=1\). This changes when \(S=2\), as shown in the \(S=2\) part of Table 2. In this case the quota is \(\lfloor 501/3 \rfloor +1 = 168\), and the vote totals in the first two rounds are identical to the \(S=1\) case because no candidate achieves quota in the first round; however, A surpasses quota in the second round with 192 votes and their 24 surplus votes must be transferred. Since C has been eliminated, \(60(24/192)=7.5\) votes are transferred to B, \(75(24/192)=9.375\) votes are transferred to D, and \(57(24/192)=7.125\) votes are removed from the election because the 57 ballots of the form \(C \succ A\) do not indicate which candidate should receive these votes if A is elected or eliminated. Therefore, in the third round B has 162.500 votes and D has 163.375. B is eliminated, causing D to surpass quota with 233.375 votes. Thus, \(W(P,2)=\{A,D\}\).

Note that if D were not to appear on any of the ballots that are transferred when B is eliminated then D would finish with only 163.375 votes, 4.625 votes shy of quota. Since there is still one seat left to fill, D would be elected because they are the only candidate left, and this would be an example where a candidate wins without achieving quota.
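The round-by-round procedure illustrated above can be sketched in Python as follows. This is a simplified illustration and not the official Scottish count nor the code used in this study: it elects at most one candidate per round, breaks ties alphabetically, uses exact fractions instead of any official rounding convention, and is run on a small hypothetical 100-voter profile rather than on Table 1.

```python
from fractions import Fraction

def stv(profile, seats):
    """Simplified STV sketch. profile maps ballot tuples to counts."""
    quota = sum(profile.values()) // (seats + 1) + 1
    # Current weight of each kind of ballot; shrinks on surplus transfers.
    weights = {ballot: Fraction(count) for ballot, count in profile.items()}
    hopeful = {c for ballot in profile for c in ballot}
    elected = []

    def holder(ballot):
        """Highest-ranked candidate on the ballot not yet elected/eliminated."""
        return next((c for c in ballot if c in hopeful), None)

    while len(elected) < seats:
        # If the remaining candidates just fill the remaining seats, elect them.
        if len(hopeful) <= seats - len(elected):
            elected.extend(sorted(hopeful))
            break
        totals = {c: Fraction(0) for c in hopeful}
        for ballot, weight in weights.items():
            c = holder(ballot)
            if c is not None:
                totals[c] += weight
        leader = max(sorted(hopeful), key=lambda c: totals[c])
        if totals[leader] >= quota:
            # Scale the leader's ballots by surplus / total before they move on.
            scale = (totals[leader] - quota) / totals[leader]
            for ballot in weights:
                if holder(ballot) == leader:
                    weights[ballot] *= scale
            elected.append(leader)
            hopeful.remove(leader)
        else:
            # Nobody reached quota: eliminate the candidate with fewest votes.
            loser = min(sorted(hopeful), key=lambda c: totals[c])
            hopeful.remove(loser)
    return elected

# Hypothetical profile (not Table 1): 100 voters, quota 34 when seats = 2.
example = {("A", "B"): 40, ("B",): 22, ("C", "B"): 18, ("D", "C"): 20}
winners = stv(example, 2)
```

In this hypothetical election A is elected first with 40 votes, A's surplus of 6 passes to B, C is then eliminated, and B reaches quota, so `winners` is `["A", "B"]`.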

Table 2 The top (respectively bottom) table shows the vote totals for each candidate by round, and eventual STV winners, for \(S=1\) (respectively \(S=2\)) seats

As mentioned in the introduction, we are interested in four types of monotonicity anomaly that can occur in STV elections. We now define each type, focusing on the multiwinner context since 97% of the elections in our database satisfy \(S>1\). Because none of the Scottish STV elections produce ties, our definitions assume a unique winner set. A careful theoretical treatment of these anomalies, such as what appears in Elkind et al. (2017), must take ties into account. Such a treatment considers STV as a set-valued method that can output multiple sets of winners, and defines the various monotonicity anomalies accordingly. We avoid the issue of ties, and the corresponding technical notation, due to the empirical nature of our work.

Our first type of monotonicity, which we term committee size monotonicity following terminology in Elkind et al. (2017), was first introduced in Staring (1986). Committee size monotonicity requires that when we increase the number of seats available, every candidate who won a seat under the smaller seat size still wins a seat under the larger seat size. While similar in spirit to the concept of house monotonicity from apportionment theory (Balinski and Young 1982), committee size monotonicity is formally distinct.

Definition 1

(Committee size monotonicity) Given an election (P, S), for any \(1 \le i < S\) we have \(W(P,i) \subseteq W(P,S)\).

An election (P, S) for which there exists \(1\le S' <S\) such that \(W(P,S') \not \subseteq W(P,S)\) is said to demonstrate a committee size monotonicity anomaly. Such an anomaly is found in Example 1: note that \(W(P, 1)=\{B\}\), which is not a subset of \(W(P, 2)= \{A, D\}\). It seems paradoxical that B is simultaneously the “best” single candidate when \(S=1\), but not in the “top half” of candidates when \(S=2\).
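Once winner sets have been computed for several seat counts, the subset condition in Definition 1 is straightforward to test. The sketch below assumes the winner sets are already available as a dictionary keyed by seat count; the function name is hypothetical, and the usage line uses the two winner sets from Example 1.

```python
def committee_size_anomalies(winner_sets):
    """Return all pairs (i, s) with i < s such that W(P, i) is not a
    subset of W(P, s). winner_sets maps a seat count to a set of winners."""
    sizes = sorted(winner_sets)
    return [
        (i, s)
        for s in sizes
        for i in sizes
        if i < s and not winner_sets[i] <= winner_sets[s]
    ]

# Winner sets from Example 1: W(P, 1) = {B} and W(P, 2) = {A, D}.
found = committee_size_anomalies({1: {"B"}, 2: {"A", "D"}})
```

For Example 1 the result is `[(1, 2)]`, reflecting that B wins with one seat but not with two.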

One reason monotonicity anomalies are of interest to social choice theorists is the paradoxical harm anomalies can demonstrate toward a political candidate or some voters. In this example it is understandable if candidate B, and voters who prefer B, feel treated unfairly by the outcome of the \(S=2\) election. In addition to candidates and voters feeling harmed, in partisan elections (i.e., elections in which candidates belong to a political party) it is also possible for political parties to be harmed. Suppose in this example B belongs to the Scottish Labour Party but A and D belong to the Scottish Conservative Party. Then Labour loses their only seat in moving from \(S=1\) to \(S=2\), and thus the party is harmed as well. Most of the previous literature on monotonicity anomalies implicitly studies nonpartisan elections, focusing only on the candidates and sometimes on the voters affected by an anomaly. Since our study concerns partisan Scottish elections, we also discuss harm to political parties when presenting our results.

We note that any empirical analysis of committee size anomalies has limitations, in that we cannot know if voters would vote substantially differently if the number of seats available were different. If Example 1 were a real-world election with \(S=2\), we would need to conduct high quality polls to know if B would be the IRV winner when \(S=1\). We do not have access to such poll data for the Scottish elections and thus we use the definition of committee size monotonicity from the previous literature, which assumes the same underlying vote data for each choice of S. For the remaining types of monotonicity, our empirical analysis does not have such limitations. It is theoretically possible that a bloc of voters could be aware of a potential anomaly in a future election, and then vote insincerely to make the anomaly favor their preferred candidate. Practically speaking, however, such an action is extremely unlikely. Voters would only vote insincerely were they aware, ahead of the election, of the particular dynamics that would make an anomaly occur. Even with high-quality polling, such information is virtually impossible to know ahead of time and thus any insincere voting (in anticipation of an anomaly) would be more likely to hurt the voters than to help them.

We now define the other three types of monotonicity. All three types have been studied primarily in a single-winner context in which it is assumed that each voter casts a ballot with a complete ranking of the candidates. Adapting definitions to a real-world multiwinner context in which voters often cast partial ballots is not straightforward. First, we state how we handle partial ballots. We adopt the weak order model (Popov et al. 2014) which assumes that a voter who casts a partial ballot is indifferent among candidates left off the ballot, all of which are ranked beneath candidates that appear on the ballot. We use only the preference information provided by the voter, and do not complete partial ballots using statistical inference. In this way we are similar to an office of elections, which does not infer any information on a ballot beyond what a voter communicated. As discussed in Popov et al. (2014) there are other ways to process partial ballots, but empirical studies regarding STV tend to interpret partial ballots as we do (Graham-Squire and Zayatz 2021; Kilgour et al. 2020; McCune and McCune 2022). Some similar studies which use real-world data to generate simulated elections, however, handle partial ballots in a variety of ways (Popov et al. 2014).
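The weak order model can be expressed as a comparison rule on partial ballots. The sketch below is one possible encoding, with hypothetical function names, in which a ballot is a tuple of the listed candidates in ranked order and the two candidates being compared are assumed to be distinct.

```python
def prefers(ballot, x, y):
    """True if the ballot ranks x strictly above y under the weak order model."""
    if x not in ballot:
        return False  # an unlisted candidate is never strictly above another
    if y not in ballot:
        return True   # every listed candidate is above every unlisted one
    return ballot.index(x) < ballot.index(y)

def indifferent(ballot, x, y):
    """True if both candidates are left off the ballot (treated as tied)."""
    return x not in ballot and y not in ballot

partial = ("A", "B")  # a voter who ranks A then B and leaves C, D off
```

With this ballot, `prefers(partial, "B", "C")` holds because B is listed and C is not, while `indifferent(partial, "C", "D")` holds because neither appears.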

Informally, upward monotonicity states that a candidate who wins a seat should not become a loser when that candidate is shifted up the rankings on some ballots and the relative rankings of the other candidates are left unchanged. Because we use the weak order model for partial ballots, “shifting a winner up the rankings” includes scenarios where the winning candidate does not appear on the actual ballots and we place that winner at the first ranking on these ballots, shifting all other candidates down. One can also allow the shifting of a winning candidate to a position other than first place. In our empirical work, though, all issues we found with upward monotonicity could be caused by moving a winning candidate to a first-place position. The formal definition below, however, allows for all forms of shifting a winner up in the rankings. We choose the term “upward monotonicity” to accord with the literature for the single-winner case; it is also referred to as candidate monotonicity in Elkind et al. (2017).

Definition 2

(Upward monotonicity) Given an election (P, S), let \(X \in W(P,S)\) and let \(\mathcal {B}\) be a set of ballots from P. If we construct a new preference profile \(P'\) from P by moving X to a higher position in the ballots from \(\mathcal {B}\) but leave unchanged the relative positions of all other candidates on the ballots from \(\mathcal {B}\) then \(X \in W(P', S)\).

An election is said to demonstrate an upward monotonicity anomaly if there exists a winning candidate X and a set of ballots \(\mathcal {B}\) such that moving X to a higher position on the ballots from \(\mathcal {B}\), but leaving the relative positions of the other candidates unchanged, creates a preference profile in which X loses.
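The ballot modification used when testing for an upward anomaly can be sketched as follows. This is an illustrative sketch only, restricted to the move-to-first-place case mentioned above; the function names and the ballot count of 10 in the usage lines are hypothetical. A search would apply such a shift and then recompute the winner set to see whether the shifted winner drops out.

```python
def move_to_first(ballot, candidate):
    """Place candidate first, keeping the relative order of everyone else.
    Also covers ballots on which the candidate was not listed."""
    rest = tuple(c for c in ballot if c != candidate)
    return (candidate,) + rest

def shift_up(profile, ballot, candidate, k):
    """Return a new profile in which k copies of `ballot` have the
    candidate moved to first place."""
    if profile.get(ballot, 0) < k:
        raise ValueError("not enough ballots of this kind")
    new_profile = dict(profile)
    new_profile[ballot] -= k
    new_ballot = move_to_first(ballot, candidate)
    new_profile[new_ballot] = new_profile.get(new_ballot, 0) + k
    return new_profile

# Hypothetical count of 10 ballots D > A > C; shift A up on 6 of them.
shifted = shift_up({("D", "A", "C"): 10}, ("D", "A", "C"), "A", 6)
```

The original profile is left untouched, so several candidate shifts can be tried against the same starting data.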

Informally, downward monotonicity states that a candidate who does not win a seat should not become a winner when that candidate is shifted down the rankings on some ballots and the relative rankings of the other candidates are left unchanged. Because of partial ballots, downward monotonicity is more difficult to define in a real-world context. For example, suppose candidate A does not win a seat but A would win a seat if we take 10 ballots with A ranked first and no other candidates listed on the ballot (we refer to such ballots as bullet votes for A) and change those ballots to \(B \succ A\). Under the weak order model, adding B to the ballot in this manner changes the relative ordering of the candidates besides A, and thus such an outcome would not count as a violation of downward monotonicity under a traditional definition. However, this scenario fits the spirit of a downward monotonicity violation. To account for this issue of partial ballots, we adapt the classical single-winner definition of downward monotonicity into strong and weak forms. The strong form requires that the relative rankings of candidates besides the affected losing candidate are unchanged (similar to the classical notion of downward monotonicity), whereas the weak form allows for situations in which we change bullet votes.

Definition 3

(Downward monotonicity) Given an election (P, S), let \(X \not \in W(P, S)\) and let \(\mathcal {B}\) be a set of ballots from P such that X appears on all ballots in \(\mathcal {B}\).

  • Strong Downward Monotonicity: If we construct a new preference profile \(P'\) from P by moving X to a lower position in the ballots from \(\mathcal {B}\) but leave unchanged the relative positions of all other candidates on the ballots from \(\mathcal {B}\) then \(X \not \in W(P', S)\).

  • Weak Downward Monotonicity: Let \(\mathcal {B}_1\) and \(\mathcal {B}_2\) be a partition of \(\mathcal {B}\) such that \(\mathcal {B}_2\) consists of bullet votes for X. If we construct a new preference profile \(P'\) from P by moving X to a lower position in the ballots from \(\mathcal {B}_1\) but leave the relative positions of all other candidates on the ballots from \(\mathcal {B}_1\) unchanged, and we change all ballots in \(\mathcal {B}_2\) to bullet votes for Y or to ballots of the form \(Y \succ X\) for some candidate \(Y\ne X\), then \(X \not \in W(P', S)\).

A downward monotonicity anomaly, either strong or weak, is defined similarly to an upward monotonicity anomaly.
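The two kinds of ballot change permitted by Definition 3 can be sketched in Python. This is a hedged illustration with hypothetical function names: the strong form is shown only for a one-position shift, a ballot on which the candidate is already the last listed is left unchanged, and the weak form applies only to bullet votes.

```python
def move_down_one(ballot, candidate):
    """Strong form: swap the candidate with the one ranked immediately
    below, leaving all other relative positions unchanged."""
    i = ballot.index(candidate)
    if i == len(ballot) - 1:
        return ballot  # already last listed; this sketch leaves it as is
    swapped = list(ballot)
    swapped[i], swapped[i + 1] = swapped[i + 1], swapped[i]
    return tuple(swapped)

def change_bullet_vote(ballot, candidate, other, keep_candidate=True):
    """Weak form: turn a bullet vote for `candidate` into `other > candidate`
    or, if keep_candidate is False, into a bullet vote for `other`."""
    if ballot != (candidate,):
        raise ValueError("not a bullet vote for this candidate")
    return (other, candidate) if keep_candidate else (other,)

lowered = move_down_one(("B", "C", "A"), "B")
weakened = change_bullet_vote(("A",), "A", "B")
```

Here `lowered` is `("C", "B", "A")` and `weakened` is `("B", "A")`, the latter matching the bullet-vote scenario described before the definition.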

When \(S=2\), the election in Table 1 contains both an upward and a strong downward monotonicity anomaly. To demonstrate the upward anomaly, observe that if six voters who cast the ballot \(D\succ A \succ C\) move A, who is a winner in the original election, up one ranking so that the 6 ballots become \(A \succ D \succ C\), then A no longer wins a seat. As illustrated in the left example of Table 3, even though A receives more votes initially, shifting A up on those 6 ballots causes D to be eliminated first instead of C and the winner set changes from \(\{A,D\}\) to \(\{B,C\}\). That is, as a result of 6 voters being persuaded that A is their top choice rather than their second-favorite, A becomes a losing candidate because the order of elimination changes. Note that for this outcome to count as an anomaly we simply need A to drop from the winner set; the simultaneous removal of D is an unfortunate side effect for this candidate. If moving A up on some ballots causes D to lose but A remains a winner, we do not say that an anomaly occurred.

Table 3 The left (respectively right) table demonstrates an upward (respectively downward) monotonicity anomaly for the election (P, 2) from Example 1

To demonstrate a strong downward monotonicity anomaly, suppose 6 voters who cast the ballot \(B\succ C \succ A\) in the original election cast the ballot \(C\succ B \succ A\) instead, moving B down one ranking. As illustrated in the right example of Table 3, D is eliminated first and the winner set is \(\{B,C\}\) for the modified election. If B were moved down one ranking on this handful of ballots, B would have been an election winner rather than a loser.

We now define our final type of monotonicity, participation monotonicity, and its corresponding type of anomaly, a no-show anomaly (this is also sometimes referred to as an abstention paradox). Informally, participation monotonicity requires that voters are better off casting ballots than abstaining from the election. This is succinctly stated in Kacprzyk et al. (2020): “it should always be better to vote honestly than not to vote at all.” The notion of a no-show anomaly has been formally defined in different ways in the context of single-winner elections. For example, Felsenthal and Nurmi (2019) states (harkening back to Moulin (1988)), “The no-show paradox occurs whenever a group of identically minded voters is better off abstaining than by voting according to its preferences.” In such a definition, the group of voters affected by the paradox must all cast the exact same ballot. Other definitions relax this assumption. Consider the definition from Holliday and Pacuit (2023): “if a candidate x is the winner in an initial election, then if we add to that scenario some new voters who rank x above y, then the addition of these new voters should not make y the winner.” Under this definition, voters affected by the anomaly need not cast identical ballots, they merely must agree that they prefer x to y.

We are unaware of previous attempts to define participation monotonicity in a multiwinner context in which voters cast preference ballots. Definitions have been proposed for multiwinner elections which do not use preference ballots (Sánchez-Fernández and Fisteus 2017), but such definitions do not easily translate to the STV setting. We choose to adapt the definition from Holliday and Pacuit (2023), but multiwinner elections contain subtleties which complicate attempts to formalize the sentiment “it should always be better to vote honestly than not to vote at all.” The reason is that, as argued in Ratliff (2006), a voter’s preferences about winner sets cannot always be distilled into a preference ranking of the individual candidates. For example, suppose in a three-seat election a voter casts the ballot \(A\succ B \succ C \succ D \succ E\succ F\). From this ranking it is clear that the voter prefers a winner set of \(\{A,B,C\}\) to \(\{D,E,F\}\), but does this voter prefer \(\{A,C,F\}\) to \(\{B,C,E\}\)? Given only the voter’s preference ranking of the candidates, we cannot say. A more pertinent question when trying to define a no-show anomaly is: does this voter prefer \(\{A,B,D\}\) to \(\{A,B,E\}\)? Suppose that when the voter participates in the election the winner set is \(\{A,B,E\}\) but when they abstain the winner set is \(\{A,B,D\}\); is the voter necessarily worse off when they cast a ballot? We choose to say the answer is Yes; however, it is conceivable that the voter would prefer \(\{A,B,E\}\) to \(\{A,B,D\}\), perhaps because of the group dynamics of the three candidates.

In addition to the concerns outlined above, there are computational challenges when searching for no-show anomalies in actual data. For these reasons, we focus on winner changes among only the two candidates x and y from the definition in Holliday and Pacuit (2023). Thus our definition of a no-show anomaly insists that if voters who prefer x to y abstain rather than vote, the only change to the winner set is that x replaces y. Other definitions, either more or less restrictive, are also sensible.

Definition 4

(Participation monotonicity) Let (P, S) be an election, with \(X \not \in W(P, S)\) and \(Y \in W(P, S)\). Let \(\mathcal {B}\) be a set of ballots on which X is ranked higher than Y. Then if we remove the ballots in \(\mathcal {B}\) from the election, it should not be the case that the resulting winner set is \((W(P, S) -\{Y\}) \cup \{X\}\).

A no-show anomaly is said to occur in an election (P, S) if there exists \(X \not \in W(P, S)\), \(Y \in W(P, S)\), and a set of ballots \(\mathcal {B}\) on which X is ranked higher than Y such that if the ballots from \(\mathcal {B}\) were removed from the preference profile then X replaces Y in the winner set. No-show anomalies, first introduced in Fishburn and Brams (1983), can be considered extreme versions of the truncation paradox, wherein certain voters benefit from not providing their entire preference ranking. The truncation paradox was first described in Fishburn and Brams (1984).
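The winner-set condition in Definition 4 reduces to a set comparison once the election has been recomputed without the removed ballots. The sketch below shows that comparison together with a helper for removing ballots; the function names are hypothetical, and the winner sets in the usage line are hypothetical inputs rather than results from the data.

```python
def remove_ballots(profile, ballot, k):
    """Return a new profile with k copies of `ballot` removed."""
    if profile.get(ballot, 0) < k:
        raise ValueError("not enough ballots of this kind")
    new_profile = dict(profile)
    new_profile[ballot] -= k
    if new_profile[ballot] == 0:
        del new_profile[ballot]
    return new_profile

def is_no_show_anomaly(winners_before, winners_after, x, y):
    """True if x lost and y won originally, and after the removal the only
    change to the winner set is that x replaces y."""
    return (
        x not in winners_before
        and y in winners_before
        and winners_after == (winners_before - {y}) | {x}
    )

# Hypothetical winner sets: before removal {A, D}, after removal {A, C}.
flag = is_no_show_anomaly({"A", "D"}, {"A", "C"}, "C", "D")
```

The check deliberately rejects cases where other seats also change hands, in line with the restriction to the two candidates x and y described above.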

When searching for no-show anomalies we look for instances of the anomaly that are unambiguous. Specifically, we try to find instances in which candidate X is ranked in the top S candidates on the affected voters’ ballots, and Y is either not present on the ballots or is not ranked in the top S candidates. Such an outcome seems like the clearest way to demonstrate that voters would have created a more desirable electoral outcome by abstaining.
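The "unambiguous" criterion described here can be phrased as a simple test on each affected ballot. This is a sketch of one reading of that criterion, with a hypothetical function name: X must be among the first S listed candidates, and Y must not be.

```python
def is_unambiguous(ballot, x, y, seats):
    """True if x is ranked in the top `seats` positions of the ballot and
    y is either absent from the ballot or ranked outside those positions."""
    top = ballot[:seats]
    return x in top and y not in top

# A ballot B > C > A in a two-seat election, comparing C with D.
clear_case = is_unambiguous(("B", "C", "A"), "C", "D", 2)
```

In this case `clear_case` is true because C is ranked second and D does not appear on the ballot.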

Our running example (P, 2) demonstrates a no-show anomaly: if 35 voters who cast the ballot \(B \succ C \succ A\) are removed from the election, creating the preference profile \(P'\), then \(W(P',2)=\{A,C\}\). These 35 voters prefer C to D, yet when they cast a ballot D is a winner, and when they abstain D is replaced by C in the winner set. In this example the voters removed from the election cast identical ballots but for our definition of a no-show anomaly, it is only relevant that the voters prefer C to D. Furthermore, this is an unambiguous no-show anomaly, as these voters rank C in their top two and thus are worse off when D (who does not appear on their ballots) replaces C in the winner set.

To conclude this section we note that these four types of monotonicity are logically independent. That is, an election which contains an upward anomaly may not contain a downward or a committee size anomaly, for example. An election such as our running example which demonstrates all four types of anomaly is most likely extremely rare. We found no examples of a Scottish election that exhibits all four anomalies, although there are elections which demonstrate three of the four. Before providing our results about the frequency of monotonicity anomalies in real-world elections, we discuss our sources of data and how we searched the data for anomalies.

4 Data sources: Scottish local government elections

For the purposes of local government, Scotland is partitioned into 32 council areas, each of which is governed by a council. The councils provide a range of public services typically associated with local governments, such as waste management, education, and building and maintaining roads. Each council area is divided into wards, each of which elects a set number of councilors to represent the ward on the council. The number of councilors representing each ward is determined primarily by the ward’s population, although other factors play a role. Every 5 years each ward holds an election in which all seats available in the ward are filled using the method of STV.

Every Scottish ward has used STV for local government elections since 2007. Preference profiles from the 2007 elections are difficult to obtain; we contacted several council election offices and either received no response or were told that the 2007 data is not available. Thus there are no elections from 2007 in our database. We obtained preference profile data for the 2012 and 2017 ward elections from the Local Elections Archive Project (Teale 2020), although some of this data is still available on various council websites. We obtained data for the 2022 preference profiles from the council websites.

In addition to the regularly scheduled local government elections which occur on a five-year cycle, council areas sometimes hold off-schedule by-elections to fill a seat that is open due to the death or resignation of a councilor. These by-elections are almost always single-winner IRV elections. The data for many of these elections is not available because some councils hand-count these ballots instead of using the STV tabulation software that is used for the regularly-scheduled elections. We obtained preference profiles for the available by-elections from various council websites, and by request from several council election offices.

In all, we collected the preference profile data of 1079 STV elections, 30 single-winner and 1049 multiwinner. While we would prefer to have preference data from all Scottish local government elections, including 2007 elections and all off-schedule by-elections, the database we use is large enough to draw robust conclusions about the frequency of monotonicity anomalies in real-world STV elections.

As mentioned in Sect. 2, this collection of actual ballot data is what sets our study apart from most prior empirical and semi-empirical research on monotonicity anomalies. For each election in our database we have a complete record of the preference ranking of candidates expressed by each voter, which means that we do not need to rely on surveys or other such tools to search for monotonicity anomalies. When we detect an anomaly, we can explicitly describe how the anomaly occurs. In the case of an upward or downward anomaly, for example, we can show exactly how to alter the ballots to demonstrate the anomaly.

Table 4 The number of elections in the database of 1079 elections with the given number of seats
Table 5 The number of elections in the database of 1079 elections with the given number of candidates

We conclude this section by providing basic information about the number of voters, candidates, seats, and voter behavior in these Scottish elections. Across all elections the minimum number of voters (Footnote 4) in an election is 661, the maximum is 14,207, and the median is 4790. Thus the electorates under consideration are not tiny, but they tend to be much smaller than electorates in many other publicly accessible databases of elections that use preference ballots. For example, the city of Minneapolis, Minnesota uses IRV to elect a single city councilor from each of its 13 wards. In the 2021 Minneapolis city council elections (Footnote 5) the median number of voters across the wards was 11,326, more than double the median from the Scottish elections. Electorates from other American IRV elections in places such as New York City or the state of Maine tend to be much larger.

Table 4 (resp. 5) shows a breakdown of the number of elections by number of seats (resp. candidates). The number of seats for elections in the database tends to be 3 or 4; there was no election with \(S>5\). The number of candidates ranges from 3 to 14, although the majority of elections have 6, 7, or 8 candidates.

In Scottish local government elections voters are not required to provide a complete ranking of all the candidates, and thus many ballots contain only a partial ranking (often referred to as ballot truncation). When we process the ballot data we assume that a voter prefers any candidate ranked on their ballot to any candidate not ranked on their ballot and we make no inference as to how the voter would have ranked candidates left off their ballot. It is possible that our results would change if the ballots were processed differently; we handle the ballots as we do because we prefer to consider precisely the ranking information provided by the voters. We note that ballot truncation is more the norm than an aberration in Scottish elections. Specifically, the average voter casts a ballot which ranks fewer candidates than seats to be elected, and many fewer than the number of available candidates. Table 6 shows the average number of candidates ranked (the ballot length) for elections with a given number of seats; the median ballot length was 3 for any number of seats. To get a sense of the relationship between average ballot length and the number of candidates, Table 7 shows that as the number of candidates increases in a 4-seat election, the average ballot length also generally increases. The growth is quite slow, however. In elections with 7 or more candidates, the average voter ranks fewer than half of the candidates. In 4-seat elections, the median ballot length was 3 for any number of candidates.
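The truncation convention and the ballot-length statistics described above can be illustrated with a short sketch. The profile representation (a dictionary mapping a ranking tuple to the number of ballots cast with that ranking) and the function names are assumptions made for illustration; they are not the format of our data files or of our code.

```python
from statistics import median

def prefers(ballot, x, y):
    """Return True if the ballot expresses a preference for x over y.

    A ranked candidate is preferred to an unranked one; nothing is inferred
    about the relative order of two candidates both left off the ballot.
    """
    if x in ballot and y in ballot:
        return ballot.index(x) < ballot.index(y)
    return x in ballot and y not in ballot

def ballot_length_stats(profile):
    """Return (mean, median) number of candidates ranked per ballot.

    `profile` is a hypothetical format: {ranking tuple: ballot count}.
    """
    lengths = [len(b) for b, n in profile.items() for _ in range(n)]
    return sum(lengths) / len(lengths), median(lengths)
```

Under this convention a bullet vote for B counts as preferring B to every other candidate, but it says nothing about how that voter compares A and C.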

Table 6 Average number of rankings for the given number of seats in an election
Table 7 Average number of rankings with the given number of candidates in 4-seat elections

5 Methodology: how we search for monotonicity anomalies

In this section we provide a high level description of the code we created to search for monotonicity anomalies. The code is available at Redacted (2022), and is adapted from programs used in Graham-Squire and Zayatz (2021).

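Searching for committee size anomalies is straightforward: calculate \(W(P,S')\) for each \(1\le S' <S\) and check whether \(W(P,S') \subset W(P,S)\); a committee size anomaly occurs exactly when this containment fails for some \(S'\). Because this check is exhaustive, if an election contains a committee size anomaly then such code is guaranteed to find it.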
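A minimal sketch of this check is given below. It assumes a caller-supplied function `winners(profile, k)` that tabulates the STV election with k seats and returns the winner set; that function, its signature, and the name `committee_size_anomalies` are hypothetical and stand in for whatever tabulator is used.

```python
def committee_size_anomalies(winners, profile, seats):
    """List the smaller committee sizes at which containment fails.

    `winners(profile, k)` is an assumed, caller-supplied STV tabulator that
    returns the set of k winners. The result is a list of pairs
    (smaller_size, candidates who win with smaller_size seats but lose
    with `seats` seats); an empty list means no committee size anomaly.
    """
    full = set(winners(profile, seats))
    found = []
    for smaller in range(1, seats):
        dropped = set(winners(profile, smaller)) - full
        if dropped:
            found.append((smaller, dropped))
    return found
```

Because the loop visits every smaller committee size, the search is exhaustive for this anomaly type, in contrast to the heuristic searches needed for the other three.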

Searching for the other anomalies is much more difficult, as the code must look for a set of ballots which demonstrate the given anomaly. Unless \(S=1\) and \(n=3\) (which occurs in none of our elections) there are no known necessary and sufficient conditions for an election to demonstrate a given anomaly, and therefore if an anomaly exists we cannot guarantee that our code will find it. We use two different kinds of program to search for anomalies, as outlined below. The first type of code is sophisticated in that it analyzes the votes-by-round table to try to engineer an anomaly by changing the order in which candidates are elected or eliminated. The second type of code uses brute force by systematically changing batches of ballots.

The first program works as follows. At each round of the election, it looks for modifications to the preference profile (raising or lowering a candidate’s ranking, or eliminating certain ballots) that could change the order of elimination or candidates being elected in the original election, and then checks to see if the modified profile would result in appropriately different winners. We provide a more detailed description of the upward monotonicity program; the downward and no-show programs are conceptually similar.

The upward monotonicity program first runs the original STV election and calculates the winner set \(W(P,S)\) and the set E of eliminated candidates, in order of elimination. Let \(\mathcal {C}\) denote the set of candidates in the election and let \(E_1\) be the first eliminated candidate, \(E_2\) the second eliminated, etc. The program has two phases, one where it searches for anomalies with changes in the elimination order, and the other where it looks for anomalies related to changes in the manner in which candidates are elected. To look for changes in the elimination order, the code chooses a winner \(W_m \in W(P,S)\), and a candidate \(C_i\) in \(\mathcal {C}-\{W_m, E_1\}\). The program checks for ballots with \(C_i\) listed first where the following would happen: \(W_m\) could be raised higher in enough ballots so that \(C_i\) would be eliminated before \(E_1\), without first making \(W_m\) surpass quota. If such ballots exist, the program shifts \(W_m\) to the top of all such ballots and reruns the election with the modified profile \(P'\). If \(W_m\) is not in \(W(P', S)\), then the program reports an anomaly. The program then reverts to the original profile P and checks all other candidates \(C_i\) for the given \(W_m\), then chooses a different winner and repeats the process until all pairs of \(W_m\) and \(C_i\) have been exhausted at the level of n candidates. At this point, the program eliminates candidate \(E_1\) to get a new profile \(P_{n-1}\), and repeats the process above for the second eliminated candidate \(E_2\), remaining winners \(W_m\), and remaining candidates \(C_i\). The program continues eliminating candidates and checking all possible changes of elimination order until all eliminated candidates are exhausted. If an anomaly is reported at this stage then it is possible that the program has returned a false positive, which occurred a few times.

In the second phase to look for changes in the seat order, the program reverts back to the original election, and then runs the election to the point where a candidate \(W_1\) is elected. The code then tries to move enough ballots so that \(W_1\) is not elected in that round of the election by doing the following: it chooses a different winner of the election, \(W_2\), who was elected after \(W_1\), and finds ballots where \(W_2\) is ranked second and \(W_1\) is ranked first, and modifies the ballots so that \(W_2\) moves above \(W_1\), thus reducing the number of first-place votes that \(W_1\) has. Once those ballots are modified, the code reruns the election to see if \(W_2\) is still a winner. If not, the code reports an upward monotonicity anomaly. If so, the code reverts back to the original election data and repeats the process with all other such combinations of winners.

The second type of program uses brute force to make small, incremental changes in the ballot data. To search for upward anomalies, the code works as follows. For each winner \(W_i\) and each other candidate C, the code finds ballots on which C is ranked first and, ballot-by-ballot, shifts \(W_i\) up to the first ranking. The code starts with ballots on which \(W_i\) is ranked second, then moves on to ballots on which \(W_i\) is ranked third, etc. The code also tries other orderings of ballots to use. For example, we also ran code which started the ballot swaps with ballots on which C is ranked first and \(W_i\) is ranked second, then moved on to bullet votes for C. If the code finds a set of such ballot changes which produces an election in which \(W_i\) is no longer a winner, an anomaly is reported. The downward and no-show programs are implemented similarly.
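The brute-force idea can be sketched in a few lines for the single-winner case. The sketch below is an illustration only and is not our program: it assumes a plain IRV tabulator with alphabetical tie-breaking in place of the Scottish multiwinner STV rules, it represents ballots as tuples, and all function names are hypothetical.

```python
def irv_winner(ballots):
    """Single-winner IRV on a list of ranking tuples (truncation allowed).

    Assumption: ties for elimination are broken alphabetically.
    """
    remaining = {c for b in ballots for c in b}
    while True:
        tally = {c: 0 for c in remaining}
        for b in ballots:
            top = next((c for c in b if c in remaining), None)
            if top is not None:
                tally[top] += 1
        leader = max(tally, key=lambda c: (tally[c], c))
        if 2 * tally[leader] > sum(tally.values()) or len(remaining) == 1:
            return leader
        loser = min(tally, key=lambda c: (tally[c], c))
        remaining.remove(loser)

def raise_to_top(ballot, w):
    """Move w to first place, keeping the other rankings in order."""
    return (w,) + tuple(c for c in ballot if c != w)

def brute_force_upward(ballots):
    """Raise the winner on ballots headed by another candidate, one at a time.

    Returns (candidate heading the altered ballots, number of ballots
    altered) for the first change that unseats the winner, else None.
    """
    w = irv_winner(ballots)
    others = sorted({c for b in ballots for c in b} - {w})
    for c in others:
        modified = list(ballots)
        idx = [i for i, b in enumerate(ballots) if b and b[0] == c]
        # Ballots ranking w second come first, then third, ...; ballots
        # that omit w entirely come last.
        idx.sort(key=lambda i: ballots[i].index(w) if w in ballots[i]
                 else float("inf"))
        for count, i in enumerate(idx, start=1):
            modified[i] = raise_to_top(ballots[i], w)
            if irv_winner(modified) != w:
                return c, count
    return None
```

A multiwinner version would replace `irv_winner` with an STV tabulator returning a winner set and would loop over every winner, which is where most of the computation time goes.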

The brute-force code found most of the upward and downward anomalies reported by the more sophisticated code, and the sophisticated code found all anomalies reported by the brute-force code. Thus, for the case of downward and upward anomalies the brute-force code turned out to be redundant. The case of no-show anomalies is much different: the brute-force code found several no-show anomalies not found by the first kind of code. The reason is that, as outlined below, no-show anomalies can occur in STV elections without changing the order of election or elimination of candidates (this is in stark contrast to the single-winner case, in which any type of anomaly occurs by changing the order of elimination).

While we cannot guarantee that we have found all anomalous elections, we did the following to verify our results:

  • To ensure the programs would detect the different ways in which anomalies can arise, all programs were tested on sample elections we created that exhibit a wide variety of anomalies.

  • All anomalies reported in this paper were discovered by our programs and then double-checked by hand to guarantee the anomalies actually occur.

  • We looked at the votes-by-round tables (tables of the form provided in Table 2) for all 1079 elections to see if the vote totals in one of the rounds suggested that an anomaly might be present. If so, we attempted to find anomalies by hand for those elections. We were unable to find any anomalous elections in this tedious, manual fashion beyond what our code found.

  • Our programs processed the elections over the course of several weeks across multiple computers, and thus the computation time dedicated to searching for anomalies was substantial.

  • Similar programs have been used to find anomalies in single-winner ranked choice voting, and no anomalous elections have been found beyond those discovered by the programs in Graham-Squire and Zayatz (2021) and Graham-Squire and McCune (2024).

Based on the steps described above, we believe that we have found all, or almost all, of the Scottish STV elections which demonstrate a monotonicity anomaly.

6 Results

Of the 1079 elections in the database we found a monotonicity anomaly of some type in 62 of them, 61 multiwinner and one single-winner. Table 10 summarizes our findings, providing a list of all elections which contain an anomaly and indicating which anomalies we found in each election. Complete details regarding each anomaly are available in the Appendix of Redacted (2023). Recall that these elections are partisan, meaning that each candidate runs as a member of a political party or runs as an independent, and thus we also provide information about when an anomaly affects a political party. We use the following acronyms for the major Scottish political parties: Conservative (Con), Green (Grn), Independent (Ind), Labour (Lab), Liberal Democrats (LD), and Scottish National Party (SNP).

6.1 Committee size monotonicity anomalies

There are nine elections which demonstrate a committee size monotonicity anomaly, accounting for only \((9/1049)=0.86\%\) of the multiwinner elections in the database. Since we can definitively check for instances of this anomaly for a given election, we conclude that such anomalies should occur very infrequently in practice.

While nine is a small sample size, these elections lead to several observations about committee size monotonicity anomalies in actual elections. First, a political party is harmed by this anomaly in only four elections. For example, in the 2012 Dundee City Ward 5 election the candidate McIrvine of the Labour Party loses their seat in the increase from \(S=2\) to \(S=3\), but the Labour Party receives exactly one seat for both values of S. Thus from the party’s perspective no harm was done. By contrast, in the 2017 East Dunbartonshire Ward 4 election Labour receives one seat when \(S=3\) but receives zero seats in the actual election when \(S=4\). From the perspective of political parties the rate of committee size anomalies is \(4/1049= 0.38\%\), suggesting that this anomaly should not be of concern to parties in real-world elections.

Second, in theory these anomalies can be quite extreme, in the sense that if an election contains enough candidates then it is possible that \(W(P,S-1)\) and \(W(P,S)\) are not only different, but also disjoint. We do not see such outlandish outcomes in the actual data, although we did find one election (2017 Moray Ward 3) where the IRV winner is not a member of the winner set \(W(P, 3)\). Our findings suggest that in real-world elections, when this anomaly occurs a single candidate loses their seat when the number of seats is increased from \(S-1\) to \(S\).

Third, our code did not find any other type of anomaly in these nine elections. Thus our hypothetical example from Sect. 3 which demonstrates all four anomaly types represents a purely theoretical possibility.

6.2 Upward monotonicity anomalies

We found 23 elections which demonstrate an upward monotonicity anomaly, accounting for \(23/1079 = 2.1\%\) of the elections in the database. Twenty-two of the elections are multiwinner, providing a rate of \(22/1049=2.1\%\) for elections with \(S\ge 2\), and only one of the elections is single-winner, providing a rate of \(1/30 =3.3\%\) for IRV elections.

We now analyze how upward anomalies can cause harm to a political party. Consider an election in which Y is a losing candidate and X wins a seat. Suppose that an upward monotonicity anomaly is present such that Y could have won a seat by carefully campaigning for X so that X rose in certain ballots. That is, raising X in those ballots causes X to lose their seat to Y in the resulting modified preference profile \(P'\). We say that Y is harmed by the presence of that upward anomaly, and if a political party wins more seats in the modified election \((P',S)\) than in the original election \((P,S)\), we say that this party has been harmed.

We found fifteen elections in which a political party was harmed by an upward anomaly. For example, in the 2022 Highland Ward 13 election, if MacKintosh of the Green Party were ranked higher on some ballots then Fraser of the Labour Party would replace MacKintosh in the winner set, suggesting that Labour should have done some carefully targeted campaigning for the Green Party. None of the examples found were as extreme as the hypothetical example from Sect. 3. In that example, if 6 voters who cast the ballot \(D \succ A \succ C\) swapped A and D at the top of their ballots, then these voters would have caused both A and D to lose their seats, perhaps causing a party to lose two seats. There were no anomalies in the data where a set of voters would have caused their top \(K\ge 2\) favorite candidates to lose their seats if those candidates were rearranged at the top of the voters’ ballots.

We note that a monotonicity anomaly can sometimes illustrate just how “close” an election is. For the 2012 Aberdeenshire Ward 18 contest, in the original election candidate Samways received the fewest first place votes and was eliminated in the second of nine rounds. However, if the winning candidate Christie were moved up on some ballots, then Christie would eventually lose a seat and be replaced by Samways in the winner set. It seems odd that a candidate seemingly as weak as Samways could end up winning a seat through an upward anomaly, which we interpret as a sign of this election’s competitiveness.

Of the 23 elections demonstrating an upward anomaly, 19 also demonstrate a no-show anomaly and five also demonstrate a downward anomaly. For only three of the 23 elections could we not find some other type of monotonicity anomaly. While 23 is a small sample size, this suggests that upward anomalies tend to occur in conjunction with other anomalies in real-world STV elections.

6.3 Downward monotonicity anomalies

Seventeen elections demonstrate a downward monotonicity anomaly, eight strong and nine weak. All of these anomalies occur in multiwinner elections, giving a rate of \(17/1049=1.6\%\) for downward anomalies when \(S \ge 2\). For strong anomalies, the rate drops to \(8/1049=0.8\%\). One election demonstrates only downward and upward anomalies, two elections demonstrate only downward and no-show anomalies, and four elections demonstrate upward, downward, and no-show anomalies. The other ten elections exhibit only a downward anomaly.

In an election with a downward anomaly, it is clear which candidate and party (if any) have been harmed: if a candidate could have won a seat by being moved down on some ballots then this candidate is harmed by the anomaly, and if a party could have gained seats by having one of their candidates moved down on some ballots then the party is harmed as well. Of the 17 elections demonstrating downward anomalies, a political party was harmed in twelve of them. The Conservative Party is the most affected by downward anomalies, with that party not winning a seat in six of the twelve elections as a result of this anomaly. For example, in the 2017 Argyll and Bute Ward 8 election, the Conservative Party did not win a seat in the original election but would have won a seat if their candidate Wallace were moved down on some ballots.

As with the upward anomalies, none of the documented downward anomalies are as extreme as the hypothetical example from Sect. 3. We found no elections in which there exists a set of voters whose ballots start with \(A\succ B\) and both A and B do not win a seat, but if A were moved down on these ballots then both A and B win a seat. However, a few of the strong downward anomalies occur in a fashion we have not observed before. In a “typical” downward anomaly from prior literature, a losing candidate A loses in the penultimate round to another candidate B, but when A is shifted down on some ballots then A is able to win by changing the elimination order so that A no longer faces B in that penultimate round. Our results show that downward anomalies in multiwinner elections can exhibit much different dynamics. For example, in the 2022 Perth and Kinross Ward 4 election Murray loses to Williamson by approximately 13.4 votes in the penultimate round, as shown in Table 8. If we shift Murray down one ranking on 37 ballots of the form Murray \(\succ \) McDougall then Murray still faces Williamson in the penultimate round but now Murray beats Williamson by approximately 7.74 votes. This anomaly occurs by swapping McDougall and Metcalf in the elimination order, but otherwise the order of elimination and election remains the same. It is strange that eliminating McDougall in the fourth round and eliminating Metcalf in the sixth round results in Williamson winning a seat, but eliminating McDougall in the sixth round and eliminating Metcalf in the fourth round results in Murray winning a seat. Some other examples of downward anomalies in our data are similarly strange when compared to downward anomalies from prior literature.

We do not have any insight into why strong downward anomalies occur with much lower frequency than upward anomalies in the Scottish data. This empirical finding is consonant with prior work such as Graham-Squire and McCune (2023), Lepelley et al. (1996) and Miller (2017), which show that upward anomalies occur more frequently in IRV elections than strong downward anomalies (Footnote 6).

Table 8 The strong downward monotonicity anomaly in the 2022 election in the Highland Ward of the Perth and Kinross council area

6.4 No-show anomalies

We found 39 elections which demonstrate a no-show anomaly, accounting for \(39/1079 = 3.6\%\) of the elections in the database, and a political party was harmed in 29 of them. The Labour Party is the most affected by this anomaly, with 12 of the 29 elections featuring a losing Labour candidate who would have won a seat if some of their supporters abstained (and this candidate would not simply replace another Labour candidate in the winner set). Thirty-eight of the 39 elections are multiwinner; we found a no-show anomaly in only one of the single-winner elections. Nineteen of the elections also demonstrate an upward anomaly and six also demonstrate a downward anomaly.

For 2 of the 38 multiwinner elections demonstrating a no-show anomaly we could not find a set of ballots to remove such that the affected candidate is ranked in the top S rankings on all removed ballots. These elections are marked with the letter “b” in Table 10. For example, in the 2022 Fife Ward 10 election if we remove 93 ballots on which the losing candidate Smart is ranked above the winning candidate Leslie then Smart replaces Leslie in the winning set, but for some of these ballots Smart is not ranked in the voters’ top four candidates.

Table 9 (Top) The votes-by-round table for the 2017 council election in the seventh ward of the Shetland Islands council area

We found eighteen elections which demonstrate only a no-show anomaly. An example is the 2017 election in the seventh ward of the Shetland Islands, the votes-by-round table for which is displayed in the top of Table 9. Note that Valente does not win a seat while Westlake does. If we remove one ballot of the form Campbell \(\succ \) Valente \(\succ \) Smith \(\succ \) Wishart \(\succ \) Westlake then the election plays out as shown in the bottom of Table 9. By removing this ballot we delay the election of Campbell by one round, which eventually allows Valente to win narrowly over Westlake. Prior to our analysis of the Scottish database, there were no known real-world elections demonstrating only a no-show anomaly. (Based on the ballot data for the Shetland election, it is straightforward to show that we cannot find any other anomaly in this election.)
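The counts reported across Sects. 6.1 to 6.4 can be reconciled with a short inclusion-exclusion calculation. The sketch below uses only numbers stated in this section; the variable names are ours and are introduced purely for illustration.

```python
# Counts as reported in Sects. 6.1-6.4.
committee_only = 9            # committee size anomalies (no other type found)
up, down, noshow = 23, 17, 39
up_and_noshow = 19
up_and_down = 5
down_and_noshow = 6
all_three = 4                 # upward, downward, and no-show together

# Inclusion-exclusion over the upward, downward, and no-show sets.
union = (up + down + noshow
         - up_and_noshow - up_and_down - down_and_noshow
         + all_three)

only_up = up - up_and_noshow - up_and_down + all_three
only_down = down - up_and_down - down_and_noshow + all_three
only_noshow = noshow - up_and_noshow - down_and_noshow + all_three
total_anomalous = committee_only + union

print(only_up, only_down, only_noshow, total_anomalous)
```

The derived values agree with the figures in the text: three elections with only an upward anomaly, ten with only a downward anomaly, eighteen with only a no-show anomaly, and 62 anomalous elections overall.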

Table 10 The one single-winner (out of 30) and 61 multiwinner (out of 1049) elections which demonstrate an anomaly

7 Discussion: multiwinner versus single-winner anomalies

Monotonicity anomalies (besides committee size anomalies) in the single-winner case have been well-studied for elections with a small number of candidates (Table 10). Prior research has shown that for a single-winner election to demonstrate an anomaly, the following conditions must be met.

  • The election contains at least three “viable” candidates. For example, as discussed in Miller (2017) and Ornstein and Norman (2014), for an upward anomaly to occur in a 3-candidate election each candidate must earn at least 25% (and less than 50%) of the first-place votes.

  • The anomaly occurs by modifying ballots in such a way that the order of elimination of candidates changes, resulting in different candidates surviving to subsequent rounds.

The single-winner 2017 by-election from the Perth City South Ward in the Perth and Kinross council area provides a typical example of an upward anomaly. The election contained six candidates; the preference profile in Table 11 shows the ballot data after the bottom three candidates were eliminated, leaving surviving candidates Barrett (B), Coates (C), and Leitch (L). For convenience, we combine ballots of the form \(C_1 \succ C_2 \succ C_3\) with ballots of the form \(C_1 \succ C_2\), as each type of ballot conveys the same ranking information. Table 11 shows that the first-place vote totals are 1733, 1762, and 1883 for B, C, and L, respectively. Thus Barrett is eliminated and, after the vote transfers, Coates wins with 2381 votes to Leitch’s 2227. If we change 151 bullet votes for Leitch to ballots of the form Coates \(\succ \) Leitch, then Coates is no longer the winner of the resulting election. The reason is that in the modified election Leitch (instead of Barrett) is eliminated first, and the vote totals in Table 11 show that Coates would then lose head-to-head against Barrett.

This election exhibits a no-show anomaly in a similar fashion. If we remove 151 ballots of the form Leitch \(\succ \) Barrett \(\succ \) Coates then in the resulting election Leitch is eliminated first and Barrett still has enough support to defeat Coates head-to-head in the final round. The 151 voters who cast these ballots would have achieved a more desirable electoral outcome (obtaining their second choice instead of their third) if they had not participated in the election.
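The change in elimination order in both modifications follows directly from the first-place totals quoted above (1733, 1762, and 1883 for B, C, and L). As a worked check, after 151 Leitch ballots are either converted to Coates \(\succ \) Leitch ballots (the upward anomaly) or removed (the no-show anomaly), the first-place totals become

\[
\begin{aligned}
\text{Barrett:}\quad & 1733,\\
\text{Leitch:}\quad & 1883 - 151 = 1732 < 1733,\\
\text{Coates:}\quad & 1762 + 151 = 1913 \ \text{(upward case)} \quad \text{or} \quad 1762 \ \text{(no-show case)}.
\end{aligned}
\]

In either case Leitch now has the fewest first-place votes and is eliminated first. Altering only 150 ballots would leave Leitch tied with Barrett at 1733, so 151 is the smallest number of such ballots for which Leitch strictly trails Barrett.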

Table 11 The preference profile for the 2017 by-election in the tenth ward of the Perth and Kinross council area, after eliminating the bottom three candidates

This election contains the hallmark features of an IRV election demonstrating an anomaly as outlined above. All three candidates are “strong” in the sense that their first-place vote totals surpass 25% of the total votes, and we engineer an anomaly by changing the order of elimination so that the set of candidates surviving in the last round is different from that in the original election. All other documented examples of real-world IRV elections demonstrating upward, downward, or no-show anomalies display a similar dynamic (Graham-Squire and McCune 2024).

This example (and prior research) raises the questions: in the multiwinner case, must the election contain \(S+2\) “viable” or “strong” candidates? Must an anomaly occur by changing which candidates survive to a subsequent round? Interestingly, the answer to both questions is No. To show why, consider the example of the 2017 election in the ninth ward of the Aberdeenshire council area, the votes-by-round table for which is shown in the top of Table 12. All candidates except Morgan win a seat, and there are not \(S+2\) viable candidates. If we shift Morgan down to the second ranking on 17 Morgan \(\succ \) Davidson ballots, the election unfolds as shown in the bottom table of Table 12. After the change in ballots, Davidson makes quota in the first round and thus is ineligible to receive vote transfers from the election of Owen, and as a result Morgan wins a seat instead of Kahanov–Kloppert. That is, we engineer an anomaly by making Davidson achieve quota a round earlier. The same three candidates as in the original election survive to the penultimate round, and yet the final two candidates to receive a seat are different. This Aberdeenshire example shows that monotonicity anomalies in multiwinner elections can arise in surprising and interesting ways that would not be anticipated from the study of single-winner STV elections.

Table 12 (Top) The votes-by-round table for the 2017 council election in the ninth ward of the Aberdeenshire council area

Many of the elections demonstrating no-show anomalies, particularly the elections which do not demonstrate any other type of anomaly, display dynamics that are even further from the single-winner case than the Aberdeenshire election discussed above. In the Aberdeenshire example, we create a downward anomaly by changing the round in which Davidson achieves quota. For many of the no-show anomalies, we can engineer the anomaly without changing the round in which any candidate is elected or eliminated, until the round in which a different candidate wins a seat. To see an example, consider the 2022 council election in the seventh ward of the West Lothian council area, the votes-by-round table for which is displayed in the top of Table 13. Note that Paul surpasses quota by 9 votes, 7.08 of which are transferred to fellow Labour candidate Sullivan and 0.04 of which are transferred to Fairbairn. As these numbers suggest, Sullivan is the candidate who benefits most from the election of Paul. The additional 7.08 votes eventually allow Sullivan to win the last seat, as Sullivan defeats Fairbairn by approximately two votes.

If we remove 10 ballots of the form Paul \(\succ \) Fairbairn then the election unfolds as shown in the bottom table of Table 13. The removal of these ballots causes Paul to exceed quota by only one vote, dramatically decreasing the number of votes transferred to Sullivan (Fairbairn, on the other hand, barely feels the difference). As a result, Fairbairn defeats Sullivan by approximately four votes in the penultimate round and replaces Sullivan in the winner set.

The modified election which demonstrates the no-show anomaly unfolds in exactly the same order as the original election; there is no change in the order of election or elimination of candidates until the penultimate round, when Sullivan is eliminated instead of Fairbairn. This kind of phenomenon is not possible in the single-winner case, where a no-show anomaly only occurs by changing the order of elimination of the candidates. This example illustrates that the multiwinner setting can provide a much richer set of no-show anomaly examples.

Table 13 (Top) The votes-by-round table for the 2022 council election in the seventh ward of the West Lothian council area

Another difference between no-show anomalies in the single-winner case and the multiwinner case is that in the former, it is not possible to make a losing candidate L into a winner by removing only ballots on which L is ranked first. To demonstrate a no-show anomaly when \(S=1\), the removed ballots must have L ranked second or lower. Interestingly, this is no longer the case in the multiwinner setting. To see an example, consider the 2022 election in the fifth ward of the City of Edinburgh Council area. The votes-by-round table for the actual election is displayed in the top of Table 14. Note that Bandel wins a seat while Wood does not. If we remove two bullet votes for Wood then the election unfolds as shown in the bottom of Table 14. Removing these two ballots lowers quota by one and as a result Nicolson surpasses quota by only 0.4 votes when she earns a seat. In the original election Nicolson surpassed quota by 252.8 votes and much more of this surplus was transferred to Bandel than Wood. In the modified election Bandel does not receive this large vote transfer, and therefore Wood is able to win the last seat. It is strange that in the original election Bandel defeats Wood by approximately 46 votes in the penultimate round, and removing two ballots which should be favorable to Wood causes Wood to defeat Bandel by approximately 32 votes in the penultimate round. This is the only election we found in which we can demonstrate a no-show anomaly by removing ballots on which the affected losing candidate is ranked first.
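The sensitivity of the quota to a handful of ballots can be sketched as follows. The sketch assumes the Droop-style quota, floor(votes / (seats + 1)) + 1; the vote totals are hypothetical numbers chosen for illustration, not the Edinburgh figures, and the function names are ours.

```python
def droop_quota(valid_votes, seats):
    """Assumed quota rule: floor(valid_votes / (seats + 1)) + 1."""
    return valid_votes // (seats + 1) + 1

def surplus(candidate_votes, quota):
    """Votes above quota, or None if the candidate has not reached quota."""
    if candidate_votes < quota:
        return None
    return candidate_votes - quota

# Hypothetical 4-seat election: removing two ballots lowers the quota by one.
original_quota = droop_quota(10_000, 4)   # 2001
modified_quota = droop_quota(9_998, 4)    # 2000

# A candidate sitting on a hypothetical 2000.4 votes falls short of the
# original quota but clears the modified quota with a tiny surplus.
print(surplus(2000.4, original_quota), surplus(2000.4, modified_quota))
```

In this hypothetical, the candidate is elected a round earlier under the lower quota and passes on almost no surplus, which is one way a small change in the number of ballots can redirect the later vote transfers.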

Table 14 (Top) The votes in each round for each candidate in the 2022 election in Ward 5 of the City of Edinburgh Council Area

The primary factors distinguishing the multiwinner STV setting from the single-winner setting are that a multiwinner election often continues for many rounds after a candidate achieves quota, and that the quota in the modified election is lower than in the original election if enough votes are removed. This creates opportunities for engineering anomalies by changing the ballot data so that a winning candidate achieves quota in a different round or surpasses quota by a different amount. Tables 12 and 13 both illustrate this point, and there are many other such examples. The use of a quota introduces a potentially “distortionary” effect into the vote transfer process, where if a candidate is close to earning quota or just barely surpasses quota in a given round then there is a chance to create an anomaly. In the case of no-show anomalies, the use of a quota can create dynamics as observed in Table 13, where a minority group of voters who do not express support for a single party can pay a price for casting their votes.
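The sensitivity of the quota to small changes in the number of ballots can be illustrated with a minimal sketch. It assumes the Droop quota, \(\lfloor V/(S+1) \rfloor + 1\), which is the quota used in Scottish local government elections. The ballot counts and the candidate's vote total below are hypothetical and are not taken from Tables 12 to 14.

```python
def droop_quota(valid_ballots: int, seats: int) -> int:
    """Droop quota: floor(V / (S + 1)) + 1."""
    return valid_ballots // (seats + 1) + 1


# Hypothetical 4-seat ward (numbers chosen for illustration only).
seats = 4
original_quota = droop_quota(10_001, seats)  # 10001 // 5 + 1 = 2001
modified_quota = droop_quota(9_999, seats)   # 9999 // 5 + 1 = 2000

# A hypothetical candidate holding 2000.4 votes in some round falls short of
# quota in the original election but surpasses it in the modified election.
votes = 2000.4
reaches_quota_originally = votes >= original_quota
reaches_quota_after_removal = votes >= modified_quota
surplus_after_removal = round(votes - modified_quota, 1)

print(original_quota, modified_quota)
print(reaches_quota_originally, reaches_quota_after_removal, surplus_after_removal)
```

Whether removing two ballots lowers the quota depends on where the ballot total sits relative to a multiple of \(S+1\); in this sketch the total was chosen so that it does.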

The multiwinner setting also produces different, higher anomaly rates than in the single-winner case. The most comprehensive empirical study of anomaly rates for single-winner STV elections is Graham-Squire and McCune (2023), which found rates of 2.2%, 1.6%, and 0.5% for upward, downward, and no-show anomalies, respectively, in a large database of American STV elections. However, that study used a database of elections in which no candidate earned an initial majority of first-place votes, which dramatically decreased the denominator used to calculate the reported rates. If we were to include American single-winner STV elections in which a candidate wins in the first round, the reported rates would decrease significantly. Thus, multiwinner STV elections seem to produce much higher anomaly rates than single-winner elections. Perhaps this is to be expected: if there are more winning candidates then there are more opportunities for a candidate to be affected by an upward anomaly, for example.

What is more surprising is that the rates of each anomaly type relative to the others are quite different in the multiwinner case than in the single-winner case. Of the five single-winner elections which demonstrate anomalies in Graham-Squire and McCune (2023), there are four upward anomalies, three downward anomalies, and only one no-show anomaly. Five is a small sample size, but these findings suggest that upward anomalies occur at a higher rate than the others, and no-show anomalies are incredibly rare (prior to the work in this article, the only documented no-show anomaly in a real-world election occurred in the August 2022 Special Election for US House in Alaska (Graham-Squire and McCune 2024)). In the multiwinner setting, no-show anomalies occur at a much higher rate than other anomalies. The reason seems to be that we can create examples such as the election in Table 13, where an anomaly is exhibited without changing the order of elimination or election of candidates in previous rounds.

8 Discussion: close elections

In this section we discuss our results through an examination of how frequently anomalies arise in close multiwinner elections, since much of the prior literature focuses on the frequency of monotonicity anomalies in elections that are close in some sense. For example, Moulin (1988) and Ornstein and Norman (2014) examine the single-winner case with \(n=3\), and they define an election to be close if all three candidates receive more than 25% of the first-place votes. Both papers then argue that monotonicity anomalies are much more likely to occur in such close elections.

To build on this literature, we investigate how much closeness matters for monotonicity anomalies in the 1,049 multiwinner Scottish elections. The primary difficulty of such an investigation is that closeness is more difficult to define in the multiwinner setting. We briefly examine several reasonable notions of multiwinner closeness.

If all S winners achieve quota in the first round, we know without examining the ballot data that it is not possible for the election to demonstrate an upward, downward, or no-show anomaly. Such elections are analogous to single-winner elections in which a candidate achieves a majority of votes in the first round, which is a common occurrence in other election databases such as municipal IRV elections in the United States. Our first notion of closeness is that the election does not terminate after only one round, so that not all winners achieve quota initially. Of the 1049 multiwinner elections in the database 1026 satisfy this notion of closeness, and thus it is rare for a Scottish election to terminate in the first round. Using a denominator of 1026 rather than 1049 does not significantly alter the percentages provided in the previous section.

Next, we generalize the notion of closeness found in Moulin (1988), which states that a three-candidate election is close if the candidate with the fewest first-place votes has at least half as many first-place votes as the candidate with the most. Mimicking this idea, we say an election is three-candidate-close with parameter p if there exists a round of the election and a three-candidate subset of candidates in this round such that (1) this subset of candidates contains at least one candidate who eventually wins a seat and one candidate who does not win a seat, and (2) the smallest of the vote totals for the three candidates in this round is at least p% of the largest vote total. The black diamonds of Fig. 1 show the percentage of three-candidate-close elections which are anomalous for \(p \in \{50, 51, \dots , 95\}\). For example, when \(p=50\), 865 of the multiwinner elections in the database are three-candidate-close. All 61 of the anomalous elections are among those 865, so we plot a black diamond at height \(61/865 = 7.1\%\). As p increases, the percentage of close elections demonstrating an anomaly also generally increases. When \(p=95\) there are 43 three-candidate-close multiwinner elections, 13 of which demonstrate an anomaly, yielding a percentage of 30.2%. The red disks show similar information but exclude the elections demonstrating a committee size anomaly from the numerator.

Fig. 1 For a given p, the percentage of three-candidate-close elections that demonstrate an anomaly. The black diamonds use elections which demonstrate any anomaly, while the red disks exclude elections demonstrating a committee size anomaly

In the popular discourse around elections, an election is often called “close” if one candidate narrowly misses winning a seat in some sense. In an attempt to make this notion rigorous, we say that an election is two-candidate-close with parameter p if there exists a round of the election and a two-candidate subset of candidates in this round such that (1) one of the candidates eventually wins a seat and the other does not win a seat, and (2) the smaller of the vote totals of the two candidates in this round is at least p% of the larger. The black diamonds of Fig. 2 show the percentage of two-candidate-close elections which are anomalous for \(p \in \{50, 51, \ldots , 95\}\). As with three-candidate-closeness, as we increase p the percentage of anomalous elections also increases, although we do not obtain percentages nearly as high as in the three-candidate-closeness case. There are 346 elections which are two-candidate-close with \(p=95\), 56 of which are anomalous, producing a maximal percentage of 16.2%. The red disks show similar information but exclude the elections demonstrating a committee size anomaly from the numerator.

Fig. 2 For a given p, the percentage of two-candidate-close elections that demonstrate an anomaly. The black diamonds use elections which demonstrate any anomaly, while the red disks exclude elections demonstrating a committee size anomaly
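Both closeness notions share the same structure, so they can be checked with one routine. The following is a minimal sketch rather than the code used for this study. It assumes a hypothetical data layout in which each round is a dictionary mapping the candidates listed in that round to their vote totals; the function name, the candidate labels, and the vote totals are all illustrative.

```python
from itertools import combinations


def is_k_candidate_close(rounds, winners, k, p):
    """Return True if some round contains a k-candidate subset that includes
    at least one eventual winner and at least one eventual loser, and whose
    smallest vote total is at least p% of its largest vote total."""
    for totals in rounds:  # totals: {candidate: votes held in this round}
        for subset in combinations(totals, k):
            has_winner = any(c in winners for c in subset)
            has_loser = any(c not in winners for c in subset)
            if not (has_winner and has_loser):
                continue
            votes = [totals[c] for c in subset]
            # Condition (2), written without division: min >= (p / 100) * max.
            if min(votes) * 100 >= p * max(votes):
                return True
    return False


# Hypothetical two-seat election in which A and B win.
rounds = [
    {"A": 1200, "B": 900, "C": 850, "D": 400},
    {"A": 1000, "B": 960, "C": 940},
]
winners = {"A", "B"}

print(is_k_candidate_close(rounds, winners, k=3, p=90))
print(is_k_candidate_close(rounds, winners, k=3, p=95))
print(is_k_candidate_close(rounds, winners, k=2, p=95))
```

With \(k=2\), requiring at least one winner and at least one loser is the same as requiring exactly one of each, which matches condition (1) of the two-candidate definition.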

In the single-winner setting, elections are often described as “close” if a Condorcet winner does not exist or if different voting methods choose different winners. Building on this notion, we restrict to elections which do not contain a set of candidates which generalizes the notion of a Condorcet winner, or which produce different winner sets under different multiwinner voting rules.

To generalize the idea of a Condorcet winner, we use the concept of a Condorcet committee as defined in Gehrlein (1985) and Ratliff (2003). A set \(\mathcal {C}\) of S candidates is a Condorcet committee if for each pair of candidates \((A, B)\) with \(A \in \mathcal {C}\) and \(B \not \in \mathcal {C}\), more voters prefer A over B than the reverse. When \(S=1\), if a Condorcet committee exists then the single candidate in the set is the election’s Condorcet winner. If such a candidate does not exist, the election can be viewed as especially close or competitive. Of the 1049 multiwinner elections in the database only 18 do not contain a Condorcet committee of size S. Four of these elections demonstrate an anomaly, giving an anomaly rate of 22.2% for these very close elections.
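The definition of a Condorcet committee translates directly into a pairwise-comparison check. The sketch below is illustrative only: the ballot format (a ranking paired with a count), the function name, and the example ballots are hypothetical. It also assumes that a ranked candidate is preferred to every candidate left off the ballot, which is a common convention for partial ballots but is an assumption here.

```python
from itertools import permutations


def condorcet_committee(ballots, candidates, seats):
    """Return the Condorcet committee of the given size as a set, or None."""
    # prefer[(a, b)] counts the ballots that rank a above b.
    prefer = {pair: 0 for pair in permutations(candidates, 2)}
    for ranking, count in ballots:
        position = {c: i for i, c in enumerate(ranking)}
        for a, b in prefer:
            # Assumption: ranked candidates are preferred to unranked ones.
            if a in position and (b not in position or position[a] < position[b]):
                prefer[(a, b)] += count

    def beats(a, b):
        return prefer[(a, b)] > prefer[(b, a)]

    # Every committee member beats all non-members, so members must be the
    # `seats` candidates with the most pairwise wins; verify that candidate set.
    wins = {a: sum(beats(a, b) for b in candidates if b != a) for a in candidates}
    ranked = sorted(candidates, key=wins.get, reverse=True)
    committee, rest = set(ranked[:seats]), ranked[seats:]
    if all(beats(a, b) for a in committee for b in rest):
        return committee
    return None


# Hypothetical ballots: (ranking, number of voters casting it).
ballots = [(("A", "B", "C"), 40), (("B", "A", "D"), 35), (("C", "D"), 25)]
print(condorcet_committee(ballots, ["A", "B", "C", "D"], seats=2))

# A three-voter cycle has no Condorcet winner, so no committee of size one.
cycle = [(("A", "B", "C"), 1), (("B", "C", "A"), 1), (("C", "A", "B"), 1)]
print(condorcet_committee(cycle, ["A", "B", "C"], seats=1))
```

Sorting by pairwise wins avoids testing every size-S subset: if a Condorcet committee exists, each member has strictly more pairwise wins than each non-member, so only one candidate set needs to be verified.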

We can also focus on the elections which contain a Condorcet committee and compare different voting methods for those elections. For example, consider the method of single non-transferable vote (SNTV). Under this method, the winner set consists of the S candidates with the most first-place votes. While STV aims to achieve proportional representation, SNTV aims to achieve some level of semi-proportional representation, and electing a Condorcet committee does not attempt to achieve any level of proportional representation. If STV and SNTV both elect the Condorcet committee, in some sense the election is not close. 414 elections satisfy this condition, meaning that from this perspective 635 elections are close. Of these 635 elections, 48 demonstrate some type of anomaly, yielding an anomaly rate of 7.6% for this type of closeness.
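The comparison of winner sets described above can be sketched as follows. This is an illustration under stated assumptions: the ballot format and function names are hypothetical, ties in first-place votes are broken arbitrarily, and the STV winner set and Condorcet committee are taken as given inputs rather than computed here.

```python
from collections import Counter


def sntv_winners(ballots, seats):
    """Return the `seats` candidates with the most first-place votes.
    Ties are broken arbitrarily (an assumption of this sketch)."""
    first_place = Counter()
    for ranking, count in ballots:
        if ranking:  # skip empty ballots
            first_place[ranking[0]] += count
    return {candidate for candidate, _ in first_place.most_common(seats)}


def is_close_by_disagreement(stv_set, sntv_set, committee):
    """An election is 'not close' only when a Condorcet committee exists and
    both STV and SNTV elect it; every other election counts as close."""
    if committee is None:
        return True
    return not (set(stv_set) == set(sntv_set) == set(committee))


# Hypothetical ballots: (ranking, number of voters casting it).
ballots = [(("A", "B", "C"), 40), (("B", "A", "D"), 35), (("C", "D"), 25)]
sntv = sntv_winners(ballots, seats=2)
print(sntv)
print(is_close_by_disagreement({"A", "B"}, sntv, {"A", "B"}))
print(is_close_by_disagreement({"A", "C"}, sntv, {"A", "B"}))
```

Treating elections without a Condorcet committee as close is consistent with the counts above, since the 414 elections that are not close and the 635 that are close together account for all 1049 elections.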

There has been no prior theoretical work on closeness and the frequency of monotonicity anomalies for multiwinner STV elections. However, there has been substantial research related to closeness for 3-candidate IRV elections. Our percentages are generally lower than what is predicted by Miller (2017) or Ornstein and Norman (2014), both of which give probabilities between 12.5 and 51% for an election to demonstrate an upward or downward anomaly in closely contested single-winner contests, with the highest percentages found for the most competitive elections. Our work confirms prior observations that the closeness of an election matters for the frequency of monotonicity anomalies, and depending on the notion of closeness used we achieve percentages that are consonant with some of the prior theoretical work.

9 Conclusion

The 62 elections demonstrating monotonicity anomalies that we found, including the 53 elections which contain an upward, downward, or no-show anomaly, illustrate different results from some prior research. Specifically, these anomalies undermine the claims of Allard (1995), Bradley (1995), and Gallagher (2013), which essentially state that monotonicity anomalies either do not occur in actual STV elections or occur extremely rarely, and that monotonicity issues are therefore of no practical concern. On the other hand, the anomaly rates we found are not particularly large, unless we restrict to small sets of extremely competitive elections. Our findings suggest that an anomaly of each type should occur about 3–13 times on average per election cycle, which is small but not minuscule compared to the approximately 350 contested STV elections which occur across Scotland in a local government election year. We remind the reader that our results are a lower bound; we cannot guarantee that all anomalous elections have been found, and thus more sophisticated code could identify more anomalies, perhaps bringing the anomaly rate more in line with estimates from the single-winner literature. Currently, the problem of deciding whether a given preference profile demonstrates a particular anomaly (besides a committee size anomaly) in an STV election is computationally quite difficult. Identifying necessary and sufficient conditions for each type of multiwinner anomaly, as are currently known for certain single-winner situations, could give an upper bound or a precise number for monotonicity anomalies, and is an interesting avenue for future work.

What does the presence of these anomalies in the Scottish elections say about the use of STV? Does STV’s susceptibility to these anomalies in actual elections imply that STV should not be used? These questions cannot be answered mathematically, as the answers depend on value judgements outside mathematics. If one takes the reasonable position that monotonicity anomalies are offensive enough that methods susceptible to such outcomes should be discarded, then this article is a strong argument against the use of STV. On the other hand, if one feels that STV has benefits which outweigh the low rate of monotonicity anomalies we found in the Scottish data, then this article does not undermine the use of STV. Either way, we make a substantive contribution to the empirical social choice literature by providing the first documented examples of monotonicity anomalies in multiwinner elections and estimating the frequency of such anomalies in real-world STV elections.