Making Votes Count with Internet Voting

This paper reassesses the claim that electronic voting systems help voters to avoid common mistakes that lead to their votes remaining uncounted. While prior studies have come to mixed conclusions, I provide new, more robust evidence based on a case study of extended Internet voting trials in Geneva canton, Switzerland. The trials almost exclusively involved referendum votes. For causal identification I exploit the unique circumstance that federal safety legislation created a near-natural experiment, with some of the canton’s municipalities participating in the trials and others not. Using difference-in-differences estimation, I find that the residual vote rate decreased by an average of 0.3 percentage points if municipalities offered the possibility to vote online in addition to (mostly optically scanned) paper ballots. For cantonal measures, which are located towards the bottom of ballot papers in Geneva, the reduction increases to 0.5 percentage points. These remain relatively modest effects, and I find no evidence for a knock-on effect on electoral outcomes. However, on average only around 20% of votes were cast online where the opportunity existed, and online voting was most popular among voters with high levels of education. Despite the small effect sizes, the results of this study therefore point to the potential of Internet and, more generally, electronic voting technology to reduce avoidable voter mistakes.


Introduction
Electoral turnout, and how to improve it, range among the core concerns in political science. However, more frequently overlooked is the fact that even if citizens participate in elections, their votes do not always enter the final count. Comparative evidence suggests that between 3 and 5% of votes cast remain uncounted (i.e., are ruled blank or invalid) in the average democratic election (Martinez i Coma and Werner 2019; Uggla 2008). In some contexts, such as Latin America, it is not uncommon for the number of uncounted votes to exceed 10% of votes cast (Power and Garand 2007).
Uncounted votes (henceforth also referred to as residual votes) can reflect the actual will of voters. Some voters choose to intentionally spoil their ballots as a form of protest. Others choose to skip some of the less salient races on a ballot. However, residual votes often also result from mistakes on the part of voters. For example, uncertainty about electoral rules can lead voters to vote for more candidates than are allowed under the rules, leading to the invalidation of their votes (Carman et al. 2008). Moreover, voters may accidentally skip races due to oversight, or they may fail to mark ballots in a sufficiently clear way. This paper considers the potential of Internet voting technology to reduce such accidental residual votes.
Accidental residual votes range among the lesser known challenges to democratic legitimacy and the quality of representation. Of course, intentional residual votes are problematic as well because they often reflect voter alienation. However, citizens who accidentally cast residual votes would have wanted to make their voices heard, but failed to do so. Therefore, accidental residual votes contravene a central democratic principle: that the votes of citizens must be accurately interpreted and counted (Dahl 1989, ch. 8). What is more, accidental residual votes are unlikely to be randomly distributed across different populations of voters. Existing evidence suggests that voters with low educational attainment are more prone to errors that lead to their votes not being counted (Bullock and Hood 2002; Fujiwara 2015; Sinclair and Alvarez 2004). Where education levels are correlated with race and ethnicity, such as in the U.S., that can mean that voter errors are disproportionately committed by members of ethnic minorities (Tomz and Van Houweling 2003). By implication, accidental residual votes are likely to reinforce well-known inequalities in representation stemming from unequal participation. In closely fought contests, they may even affect electoral outcomes.
Various proposals have been made for how to reduce accidental residual votes, including better training of polling staff, improved ballot design, and voter education campaigns (Herrnson et al. 2012; Herron and Sekhon 2003; Kimball and Kropf 2005; Niemi and Herrnson 2003). Another prominent proposal concerns error-preventing voting technologies, such as electronic voting systems (Alvarez and Hall 2008). Paper ballots remain the most common voting method in contemporary democracies, more than 400 years after their invention (Reynolds and Steenbergen 2006). But when equipped with a pen and piece of paper, there is often little standing in the way of voters making avoidable errors. While they have grown controversial in recent years due to vulnerability to fraud, an advantage of electronic voting systems is that they can be programmed to prevent voters from making such mistakes. As a result, electronic voting should make it more likely that votes enter the final count and, thus, increase effective participation.
However, empirical studies of whether electronic voting technology lives up to this promise have come to mixed conclusions (e.g., Ansolabehere and Stewart 2005; Stewart 2006). This paper provides new, more robust evidence on the effect of electronic voting technology on residual votes. While prior studies focused on electronic voting machines located in polling stations, this study extends the focus to Internet voting, a novel form of electronic voting that is increasingly discussed and experimented with around the globe (Alvarez et al. 2009). Specifically, I study the case of Geneva canton, Switzerland, where Internet voting (henceforth also referred to as i-voting or online voting) has been trialed on an extended basis from 2003 to 2005 and from 2008 onward. A limitation of the Geneva case is that experimentation with i-voting has largely involved referendums while eschewing elections. However, within these constraints, Geneva canton holds valuable lessons because it enables the estimation of Internet voting's effect on uncounted ballots with comparatively high internal validity. The i-voting roll-out in Geneva canton resembles a natural experiment because, since the very first trials, federal safety legislation has been in place that led to between-municipality variation in the availability of i-voting, with i-voting being offered in some but not other municipalities. In turn, this makes it possible to hold constant many sources of confounding that could have afflicted prior studies, including jurisdiction-specific electoral laws and counting practices as well as varying electoral dynamics.
Using difference-in-differences estimation, I find that the residual vote rate decreased by an average of 0.3 percentage points if municipalities offered the possibility to vote online in addition to (mostly optically scanned) paper ballots. [1] For cantonal (i.e., regional) measures, which in Geneva are located towards the bottom of ballot papers, the reduction increases to 0.5 percentage points. These remain relatively modest effects, and additional analyses suggest that the reduction in uncounted ballots did not have a knock-on effect on electoral outcomes. However, only around 20% of votes tended to be cast online where the opportunity existed. Online voters also tend to have above-average education and, as a group, could thus be less prone to voting errors. Despite the relatively small effect sizes, the results of this study therefore point to the potential of Internet and, more generally, electronic voting technology to reduce avoidable voter mistakes.

Prior Work
The fiasco of the 2000 U.S. presidential election raised public awareness in America and elsewhere that the choice of voting technology is more than a mere technicality. Palm Beach County, Florida, demonstrated to the world that punch card ballots (a form of voting whereby voters punch holes in voting cards with a ballot marking device) are highly vulnerable to human error. As became evident, voters often fail to punch the cards cleanly, leading to the infamous 'hanging chads' that might have swayed the 2000 election to Bush (Brady et al. 2001). Awareness also grew about similar concerns with other common voting methods. For example, the optical scanners that are sometimes used for the counting of write-in ballots may not count a vote if the relevant boxes, bubbles, or arrows have not been marked in a sufficiently clear way (Kimball and Kropf 2005). More generally, voters may misunderstand the rules and, for example, vote for both a presidential and a vice presidential candidate in U.S. elections (Herron and Sekhon 2003).
In 2002, the U.S. government passed the Help America Vote Act (HAVA). As a result, billions of federal and state tax dollars were spent to update older voting technologies. Most election authorities opted for one of two technological innovations: precinct scanners and direct-recording electronic voting machines (DREs). Both, but especially DREs, came with the promise of minimizing "lost votes" due to malfunctioning voting technology and voter mistakes. Precinct scanners allow voters to have their ballot papers checked at the polling station before the casting of their votes. The most advanced implementations report both overvotes (i.e., if voters vote for more candidates or options than are allowed under the rules) and undervotes (i.e., blank votes), thus giving voters a chance to discover and correct potential mistakes (Brady et al. 2001; Kimball and Kropf 2008). However, voters often fail to pre-check their ballots (Hanmer et al. 2010). DREs, by contrast, automate much of the error checking and prevention process. For example, many DREs do not just alert voters to overvotes, but prevent overvotes altogether. DREs also remove all problems related to unclear marking of the ballot because votes are cast by touching a screen or pressing a button. Similarly to precinct scanners, DREs can also be programmed to alert voters if they are about to skip a race (e.g., with a flashing light) (Alvarez and Hall 2008).
Have these technological fixes worked as intended? The election debacle in 2000 led to a flurry of new research into the relationship between voting technology and residual votes, much of it focused on the U.S. (for a review of this literature cf. Stewart 2011). However, aside from a general consensus that punch card systems often lead to much higher residual vote rates (e.g., Brady et al. 2001; Knack and Kropf 2003; Ansolabehere and Stewart 2005; Kimball and Kropf 2008; though also cf. Lott 2009), the findings from this literature have remained contradictory. To be sure, several studies report evidence that precinct scanners (e.g., Alvarez et al. 2013; Kimball and Kropf 2008) and DREs (e.g., Stewart 2006) tend to reduce residual vote rates, including (for the case of DREs) two studies conducted outside the U.S., one in Brazil (Fujiwara 2015) and the other in the Netherlands (Allers and Kooreman 2009). Particularly promising, Tomz and Van Houweling (2003) reported evidence from two U.S. states, Louisiana and South Carolina, that DREs significantly reduce the gap between African American and white voters in terms of voided ballots. However, other studies point in different directions. In a comprehensive study of the U.S. experience that covers the whole nation and elections from 1988 to 2000, Ansolabehere and Stewart (2005) find that DREs often produce more residual votes than traditional paper-based voting methods, such as hand-counted write-in ballots. Knack and Kropf (2003) come to a similar conclusion in their study of the 1996 U.S. presidential election, whereas Brady et al. (2001) find that DREs performed comparably to, but did not outperform, most paper-based voting systems in the 2000 U.S. presidential election (also cf. Lott 2009). Regarding precinct scanners, both Knack and Kropf (2003, fn. 21) and Tomz and Van Houweling (2003, p. 56) report no significant differences in residual votes when comparing them with centrally counted optical scanning systems that do not provide voters with the possibility to check their ballots.
These disparate findings can be partly reconciled when considering that precinct scanners and DREs are not all born the same. Not all versions of precinct scanners alert voters to undervotes, which could decrease their performance (Miller 2013). At the same time, especially older DREs tended to have usability issues, which could explain why some studies found few, if any, improvements over paper-based voting technologies (Stewart 2006). Kimball and Kropf (2008), for example, show evidence from the 2004 election in the U.S. that full-face DREs that display all ballots at once on sometimes massive screens can overwhelm voters and make it more rather than less likely that they miss down-ballot contests. In addition, the performance of voting technologies is likely to differ depending on context factors, such as the complexity of electoral rules, the design of paper ballots, and levels of education.
However, many prior studies also suffer from limitations that render their conclusions uncertain (Stewart 2011). Especially earlier studies often relied on cross-sectional variation in voting technology while accounting for potential confounders by controlling for factors such as the size and average income of electoral districts (e.g., Kimball and Kropf 2008; Knack and Kropf 2003; Tomz and Van Houweling 2003). However, this risks confounding the effects of voting technology with other differences across jurisdictions that are more difficult to measure, such as the sophistication of voters, election laws, or levels of voter disaffection. Other studies have relied on more sophisticated panel data designs (e.g., Ansolabehere and Stewart 2005), which make it possible to control for cross-sectional differences via fixed effects estimation. However, a problem that remains is that these studies tend to make comparisons across different elections with potentially different dynamics, such as presidential races or ballot measures in different states. This risks conflating the effects of voting technology with election-specific factors, such as the closeness of races or their perceived importance (cf. Keele and Minozzi 2013). Therefore, more (and better) evidence on the performance of different voting technologies is needed.

Internet Voting and Residual Votes
This study sheds new light on the causal effect of electronic voting technology on residual votes based on a case study of Geneva canton. While prior studies have focused on DREs, this study extends the focus to a different form of electronic voting: Internet voting. From the perspective of residual votes, DREs and i-voting systems are not too different, as error-prevention mechanisms available in the context of DREs are often easily extendable to online voting. However, whereas DREs are for voting in polling stations, i-voting extends error-prevention mechanisms to remote voters. The remainder of this section provides a short overview of the i-voting trials in Geneva canton, including a discussion of the ways in which Geneva's i-voting solution could have reduced the residual vote rate.

Geneva's Internet Voting Trials
Geneva canton ranges among the Internet voting pioneers. In 2002, the Swiss government decided to enable trials with online voting in selected cantons. Geneva took up the challenge, along with two other cantons (Neuchâtel and Zurich). In 2003, Geneva staged Switzerland's first i-voting experiment in the context of a local-level referendum in Anières, one of its smaller municipalities. The following year, Anières and three other municipalities first used i-voting for federal (i.e., national) and cantonal (i.e., regional) referendums. I-voting experiments continued in subsequent years in these and other municipalities. The only exception was the period between June 2005 and June 2008, when Geneva's i-voting program was temporarily suspended because of the need to establish a firmer legal basis. In 2009, online voting was extended to Swiss expatriates registered in Geneva canton, an option that has remained available to them since (Pammett and Goodman 2013).
Against initial expectations, Internet voting proved only moderately popular with Geneva's voters. As in most other i-voting experiments (e.g., in Estonia), online ballots were always offered as a complement to paper ballots in Geneva canton. However, more unusually, voters in Geneva always also had the option to vote by mail. Voting materials were always automatically sent to voters around 4 weeks prior to an electoral contest, and they could then choose whether to take their ballot papers to a polling station, to return them by mail, or, where that opportunity existed, to cast their votes online. Perhaps due to this highly convenient range of choices, on average only around 20% of votes were cast online where the opportunity existed (Mendez and Serdült 2017). The only major exception was expatriate voters, 50% or more of whom tended to make use of the possibility to vote online (see Fig. 1 and, for further details, Germann and Serdült 2014).
Against the expectations of many, available evidence also suggests that the introduction of i-voting had no effect on electoral turnout (Germann and Serdült 2017). Again, a plausible reason is the availability of postal voting. Voting in Geneva is rather convenient even in the absence of online ballots. Where voters cannot also vote by mail, such as in Estonia or Canada, extant evidence suggests that online voting can provide a boost to turnout (Alvarez et al. 2009; Goodman and Stokes, forthcoming).

Electronic Safeguards Against Accidental Residual Votes
However, even if i-voting did not affect participation rates in Geneva canton, by reducing voter errors it might still have increased the effective turnout in terms of the number of valid votes cast. Next, I identify the safeguards implemented in Geneva's i-voting solution that could have helped voters to avoid accidental residual votes. Given that Geneva's i-voting trials were generally limited to direct democratic votes, the discussion (like the empirical analysis that follows) focuses on the case of referendum ballots. [2] Throughout the period analyzed (2001-2016), the same referendum ballots featuring 'yes' and 'no' boxes next to every measure were used in Geneva for both postal and the traditional ballot box voting. In order to cast a valid vote, voters had to check one of these boxes with a cross. Mail ballots were then counted centrally with optical scanners, whereas votes cast in polling stations were counted by hand in the individual polling stations. [3] As mail ballots were far more popular (often more than 90% or even 95% of paper ballots were returned by mail), most paper ballots were scanned. Neither mail nor precinct voters had access to any kind of technology, such as precinct scanners, that would allow them to check their marked ballots for errors.
Geneva's i-voting system improves upon this system in two ways. First, it prevents accidental overvotes. In the context of referendum votes, the danger that voters erroneously conclude that they are supposed to check both the yes and no boxes is likely to be small. However, it is possible that voters leave stray marks covering the second box, which can prevent the optical scanners from deciphering a voter's intention, leading to the invalidation of the vote. [4] Similarly, voters may come to realize that they did not give the answer they intended to give, leading them to check the second box while attempting to strike through (or erase) their original answer, which is not allowed. Geneva's i-voting system prevents such unintentional overvotes because voters can choose their favored answer from a drop-down menu, which makes choices correctable and ensures that voters give one answer only.
Second, Geneva's i-voting system is likely to reduce accidental undervotes. I-voting precludes undervotes that result from voters placing their checks outside of the relevant boxes. I-voting also precludes undervotes that result from voters using a red-colored pen, which cannot be read by Geneva's optical scanners. Finally, and perhaps most importantly, the i-voting software makes it less likely that voters skip a race by accident. It is not uncommon in Switzerland that voters have to vote on 5, 10, or even more ballot measures at the same time (Selb 2008). When ballots are this crowded, the chance that voters unintentionally skip one or more of the proposals is likely to increase. While it is still possible to cast a blank vote with i-voting, it becomes less likely that voters do so unintentionally because online voters are shown a confirmation screen after completing their ballot. The confirmation screen prompts voters to review their choices before the final submission. Thus, undervoting becomes more transparent and correctable.
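The two safeguards described above can be illustrated with a minimal sketch. It is not Geneva's actual software: the names `new_ballot`, `select_answer`, and `blank_measures`, as well as the measure identifiers, are hypothetical. The sketch only shows the logic of allowing one correctable answer per measure and of flagging unanswered measures for a confirmation screen.

```python
from typing import Dict, List, Optional

# Hypothetical encoding of the two drop-down options on a referendum ballot.
VALID_CHOICES = ("yes", "no")

Ballot = Dict[str, Optional[str]]  # measure id -> "yes", "no", or None (blank)


def new_ballot(measures: List[str]) -> Ballot:
    """Start with every measure blank, as on an unmarked paper ballot."""
    return {measure: None for measure in measures}


def select_answer(ballot: Ballot, measure: str, choice: Optional[str]) -> None:
    """Store exactly one answer per measure; a new selection replaces the old one."""
    if measure not in ballot:
        raise KeyError(f"unknown measure: {measure!r}")
    if choice is not None and choice not in VALID_CHOICES:
        raise ValueError(f"unknown choice: {choice!r}")
    ballot[measure] = choice


def blank_measures(ballot: Ballot) -> List[str]:
    """Measures that a confirmation screen would flag as unanswered."""
    return [measure for measure, choice in ballot.items() if choice is None]


ballot = new_ballot(["federal_1", "cantonal_1"])
select_answer(ballot, "federal_1", "yes")
select_answer(ballot, "federal_1", "no")  # a correction overwrites; it never adds a second mark
print(blank_measures(ballot))             # ['cantonal_1']
```

Because an answer is stored as a single value rather than as marks on two boxes, an overvote cannot be represented at all, while a blank answer remains possible but becomes visible before submission.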

Empirical Strategy
Could these safeguards reduce the residual vote rate? The Geneva case enables a comparatively robust answer to this question because of its staggered adoption of i-voting over time and space.
Ever since the very first i-voting trials in Geneva canton, federal safety legislation has been in place limiting the number of voters who can participate in i-voting trials. The goal of this law has been to limit the risk of electoral manipulation. Initially, the safety legislation stipulated that not more than 20% of all voters in a canton should have access to online voting. In 2012, the cap was increased to 30% of a canton's electorate. [5] To conform with this legislation, Geneva's electoral administrators decided to trial i-voting only in selected municipalities. This way, they could ensure that the federally imposed cap would never be surpassed. There was some turnover over time, with some municipalities dropping out of the trials and others joining. But i-voting was always offered in some municipalities and not others. Notably, all voters in trial municipalities had the option of voting online. Trial municipalities were selected so that the federal cap would still be met even if all of them did so. Of course, in practice only a minority of voters tended to make use of the possibility (see above), so that the actual number of online voters was consistently far below the federal cap. Importantly, trial municipalities were not selected randomly, but election officials sought to balance trial and non-trial municipalities on key socio-demographics (Germann and Serdült 2017). Table 1 shows that this strategy was partially (but not fully) successful when it comes to population size, age, education levels, and per capita income. However, even if trial and non-trial municipalities are not fully balanced, the significant upshot of the way in which Geneva responded to the federal safety legislation is that there is both over-time and between-municipality variation in the availability of i-voting. As a result, municipalities from the same political unit, some of which had i-voting whereas others did not, can be compared while they simultaneously voted on the same issues.
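The constraint imposed by the federal cap can be made concrete with a short sketch. The electorate figures and municipality labels below are invented for illustration, and `share_with_access` and `within_cap` are hypothetical helper names; the sketch only checks whether a candidate set of trial municipalities keeps the share of voters with access to i-voting at or below the cap.

```python
from typing import Dict, Iterable


def share_with_access(electorate: Dict[str, int], trial: Iterable[str]) -> float:
    """Percentage of the canton's electorate living in trial municipalities."""
    total = sum(electorate.values())
    if total <= 0:
        raise ValueError("electorate must be positive")
    eligible = sum(electorate[name] for name in set(trial))
    return 100.0 * eligible / total


def within_cap(electorate: Dict[str, int], trial: Iterable[str], cap_percent: float) -> bool:
    """True if all voters in trial municipalities could vote online without breaching the cap."""
    return share_with_access(electorate, trial) <= cap_percent


# Invented electorate sizes for four fictitious municipalities.
electorate = {"A": 2_000, "B": 23_000, "C": 25_000, "D": 50_000}

print(within_cap(electorate, ["A", "B"], cap_percent=20))  # False: 25% have access
print(within_cap(electorate, ["A", "B"], cap_percent=30))  # True under the 2012 cap
```

The check uses the full electorate of each trial municipality rather than the number of actual online voters, mirroring the selection rule that the cap should hold even if every eligible voter chose to vote online.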
This set-up facilitates difference-in-differences estimation and, therefore, makes it straightforward to account for any cross-sectional imbalances between trial and non-trial municipalities, such as, at least approximately, differences in income or education. Moreover, difference-in-differences estimation automatically takes care of any confounder that is specific to voting days or even ballot proposals, such as varying levels of voter disaffection or interest in referendum proposals. Beyond this, electoral laws apply uniformly across Geneva canton, thus ruling out bias due to variation in electoral legislation; and causal inference is facilitated further by Geneva's relatively high social homogeneity (Geneva is commonly referred to as a "city canton" and 80% of its municipalities are counted as urban or peri-urban by the Federal Statistics Office).
Overall, the case set-up in combination with difference-in-differences estimation makes it possible to rule out most possible confounders by design. However, difference-in-differences estimates are valid only if the parallel trends assumption is met (Angrist and Pischke 2009). In the present case, this means that under the counterfactual scenario in which i-voting had never been introduced, the residual vote rate should have evolved in parallel in treated (with i-voting) and control (without i-voting) municipalities. Further below I provide indirect evidence in favor of the parallel trends assumption. At the same time, I relax the parallel trends assumption via the inclusion of municipality-level time trends. Municipality-level time trends can increase our confidence that smooth changes in, for example, the socioeconomic composition of municipalities do not bias the causal estimates. Moreover, all models control for electoral participation. While existing evidence suggests that i-voting had no measurable effect on turnout in Geneva canton (Germann and Serdült 2017), controlling for participation ensures that the estimated effects are independent of any small changes in turnout rates that cannot be safely distinguished from zero. In the robustness section I show that controlling for additional time-varying confounders does not affect the results.
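The logic of the estimator is easiest to see in the simplest case of two groups and two periods. The sketch below uses invented residual vote rates (they are not estimates from this study), and `did_estimate` is a hypothetical helper name. The change in control municipalities stands in for the change that treated municipalities would have experienced without i-voting, which is exactly what the parallel trends assumption asserts.

```python
def did_estimate(
    treated_pre: float,
    treated_post: float,
    control_pre: float,
    control_post: float,
) -> float:
    """Two-group, two-period difference-in-differences, in percentage points."""
    treated_change = treated_post - treated_pre
    # Under parallel trends, this is the counterfactual change for the treated group.
    control_change = control_post - control_pre
    return treated_change - control_change


# Invented average residual vote rates (in %), before and after i-voting becomes available.
effect = did_estimate(treated_pre=5.5, treated_post=5.0, control_pre=5.4, control_post=5.2)
print(round(effect, 2))  # -0.3
```

Note that the level difference between the two groups before treatment (5.5 versus 5.4) drops out of the estimate, which is why cross-sectional imbalances between trial and non-trial municipalities do not bias it as long as trends are parallel.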
Four further remarks are in order before turning to the empirical analysis. First, prior to 2001 Geneva canton used a different paper voting system whereby voters had to write 'yes' or 'no' next to referendum proposals rather than checking boxes. Also, prior to 2001 all ballots were counted manually, including mail ballots. Therefore, all analyses reported below focus on the period from 2001 onward.
Second, in late 2016 Geneva switched to a system whereby voters from all municipalities can i-vote if they register for an online ballot. To comply with federal safety regulations, registrations are capped at 30% of the canton's electorate. Unfortunately, this implies that there is no longer between-municipality variation in i-voting availability. Given the absence of suitable control units, the analysis stops in September 2016, after which the new system was introduced.
Third, the federal safety regulation applies only to federal votes. Therefore, Geneva was in principle free to offer i-voting on a broader basis for its own, canton-level electoral contests. However, in practice cantonal referendums tend to be scheduled simultaneously with federal referendums, to profit from the latter's higher turnout and save costs. As it would be impractical to offer i-voting for some but not other votes on the same ballot, the 20/30% cap therefore implicitly also applied to most cantonal referendums. There were only three exceptions to this general rule during the period studied. In May 2011, November 2011, and October 2012, no simultaneous federal referendums were scheduled and Geneva canton therefore allowed all its citizens to vote online in cantonal referendums. Given the lack of within-canton variation, there are no plausible counterfactuals in these cases. Therefore, I exclude all 11 cantonal referendums voted on these three dates. All other cantonal referendums are included.
Finally, as there are no plausible counterfactuals for municipal contests, I exclude all municipal referendums, even if they were voted simultaneously with federal referendums.

Main Results
I proceed to estimate the causal effect of the availability of i-voting in a municipality on the number of residual votes. The sample covers all cantonal and federal referendums voted in Geneva canton between 2001 and September 2016, except for the aforementioned 11 cantonal referendums when i-voting was made available in all municipalities. Overall, the sample includes 284 referendums voted on 53 separate dates. The unit of analysis is a municipality voting on a referendum proposal. All 45 municipalities in Geneva canton are included. In addition, I include expatriate voters as a separate, artificial 46th municipality. The results remain similar when expatriate voters are dropped (see Table S1 in the Online Supplement).
The dependent variable is the residual vote rate, defined as the percentage of votes that do not enter the final count relative to the total number of votes cast. [6] The average residual vote rate in the sample is 5.3%. [7] The central explanatory variable is the availability of i-voting in a municipality. I-voting was available on 28 of the 53 voting days (156 of 284 referendums) in at least four and a maximum of 18 municipalities during the period studied (see Fig. 2 for breakdowns by municipality).
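As a minimal sketch of how the dependent variable is constructed for one municipality voting on one proposal, the function below computes the residual vote rate from raw counts. The function name, argument names, and example counts are illustrative assumptions; blank and invalid votes are simply pooled, in line with the definition above.

```python
def residual_vote_rate(votes_cast: int, blank: int, invalid: int) -> float:
    """Residual votes (blank plus invalid) as a percentage of all votes cast."""
    if votes_cast <= 0:
        raise ValueError("votes_cast must be positive")
    if blank < 0 or invalid < 0 or blank + invalid > votes_cast:
        raise ValueError("blank and invalid counts must be non-negative and sum to at most votes_cast")
    return 100.0 * (blank + invalid) / votes_cast


# Invented counts for one municipality and one referendum proposal.
rate = residual_vote_rate(votes_cast=1_000, blank=40, invalid=13)
print(f"{rate:.1f}%")  # 5.3%
```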
All estimates are based on OLS regression with municipality and referendum fixed effects. Standard errors are clustered by municipality to account for time 6 Geneva's electoral statistics do not clearly separate under-from overvotes. While undervotes always figure as 'blank', overvotes can figure as both 'blank' or 'invalid', depending on the circumstances (cf. articles 64 and 65A of Geneva's 'loi sur l'exercice des droits politiques (LEDP) [law on the exercise of political rights]'). Unfortunately, this makes it impossible to determine the extent to which the aggregate effects reported below are due to reductions in over-and undervotes, respectively. 7 Note that this figure includes a considerable number of intentional undervotes. According to post-referendum surveys (Kriesi et al. 2018), on average around 3% of Swiss voters intentionally cast a blank vote on a ballot proposal during the period under study. The most likely reason is lack of interest in some of the less salient proposals on a ballot. Usually less than 0.5% of Swiss voters reported to have cast a blank vote on each and every proposal on the ballot. The corresponding figures for Geneva canton cannot be established due to sample restrictions. dependence. Two-way fixed effects regression constitutes a generalization of difference-in-differences estimation for multiple time periods (Angrist and Pischke 2009, pp. 233-241). The causal effect of the availability of i-voting is estimated  solely based on within-municipality variation in the availability of i-voting and the residual vote rate. All models include quadratic municipality-level time trends and control for electoral participation. 8 The results suggest that Internet voting constitutes an effective method to reduce voter errors and uncounted ballots. As argued above, Geneva's i-voting solution prevents overvotes and decreases the potential for unintentional undervotes (among other things because of the confirmation screen). 
Model 1 in Table 2 suggests that as a result of this, the residual vote rate decreased by an average of 0.32 percentage points if a municipality offered the possibility to vote online ( p < 0.001 ). To get a better grasp of the magnitude of this effect, I compare the point estimate to the counterfactual numbers of residual votes had i-voting never been made available. This suggests that i-voting decreased the average residual vote rate from 5.4 to 5.1% in municipalities with i-voting, which corresponds to a 6% decrease. Expressed differently, i-voting increased the share of valid votes in an average referendum by 0.3%, from 94.6 to 94.9%. While not earth-shattering, an 0.3% increase in the number of valid votes-and, therefore, effective turnout-is notable in light of the fact that only around a fifth of voters tended to make use of online voting where that possibility existed. Also, survey evidence suggests that Geneva's online voters tended to have comparatively high levels of education. In particular, online voters were almost twice as likely to have a university degree compared to paper voters (Sciarini et al. 2013, p. 48). Therefore, Geneva's online voters constitute a demographic that, a priori, should be less susceptible to voter errors. 9 Additional analyses reported in models 2 and 3 in Table 2 suggest that electronic safeguards against residual votes are especially important when it comes to cantonal (and, thus, down-ballot) measures. Specifically, I find that whereas i-voting decreased the residual vote rate by a mere quarter of a percentage point in federal referendums ( p = 0.04 ), the reduction amounts to almost half a percentage point in cantonal referendums ( p < 0.001 ). The most plausible explanation for this finding is the extra nudge to voters provided by the confirmation screen to review their choices before the final submission. 
Cantonal referendums may be more prone to oversight and accidental undervotes, first, because they are often seen as 'second order' and tend to receive less media attention (Buetzer 2011), and second, because they are placed towards the bottom of ballot papers. 10 Therefore, asking voters to confirm their choices may be more important when it comes to cantonal referendums.

8 The results remain almost identical when dropping turnout from the list of controls (see Table S2 in the Online Supplement).
9 Internet voters in Geneva also tended to be younger than the average voter (Sciarini et al. 2013, p. 29), but age is a less likely determinant of accidental residual votes.
10 Conforming with these arguments, cantonal referendums have a higher residual vote rate compared to federal referendums (6.9% versus 3.6%). However, note that voters are also more likely to selectively abstain from cantonal referendums due to lack of interest or higher information costs.
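To make the estimation logic concrete, the following is a minimal Python sketch of a two-way fixed effects estimate. It is not the code used for the analyses reported here. It assumes a balanced toy panel with made-up numbers, a single treatment indicator, and no further controls (the quadratic municipality-level time trends and the turnout control are omitted). The names `within_transform` and `twfe_effect` are hypothetical. The last line reproduces the magnitude calculation from the text by relating the estimate to the counterfactual residual vote rate of 5.4%.

```python
from collections import defaultdict


def within_transform(values):
    """Subtract unit and period means (and add back the grand mean).

    values: dict mapping (unit, period) -> float. For a balanced panel this
    is equivalent to including unit and period fixed effects.
    """
    unit_sum, period_sum = defaultdict(float), defaultdict(float)
    unit_n, period_n = defaultdict(int), defaultdict(int)
    for (unit, period), v in values.items():
        unit_sum[unit] += v
        unit_n[unit] += 1
        period_sum[period] += v
        period_n[period] += 1
    grand_mean = sum(values.values()) / len(values)
    return {
        (unit, period): v
        - unit_sum[unit] / unit_n[unit]
        - period_sum[period] / period_n[period]
        + grand_mean
        for (unit, period), v in values.items()
    }


def twfe_effect(panel):
    """OLS slope of the outcome on treatment after the within transformation.

    panel: dict mapping (unit, period) -> (treated, outcome).
    """
    d = within_transform({k: float(t) for k, (t, _) in panel.items()})
    y = within_transform({k: float(o) for k, (_, o) in panel.items()})
    return sum(d[k] * y[k] for k in d) / sum(d[k] ** 2 for k in d)


# Toy data (all numbers are illustrative assumptions, not study data):
# three municipalities, four referendums, staggered treatment start.
unit_effect = {"A": 5.0, "B": 6.0, "C": 4.5}
period_effect = [0.0, 0.2, -0.1, 0.3]
first_treated = {"A": 1, "B": 2, "C": None}  # C never offers i-voting
true_effect = -0.32

panel = {}
for unit, base in unit_effect.items():
    for period, shock in enumerate(period_effect):
        start = first_treated[unit]
        treated = int(start is not None and period >= start)
        panel[(unit, period)] = (treated, base + shock + true_effect * treated)

estimate = twfe_effect(panel)      # recovers the effect built into the toy data
relative_change = estimate / 5.4   # change relative to the counterfactual rate
```

Because the toy outcome is exactly additive in municipality effects, referendum effects, and treatment, the within transformation removes both sets of fixed effects and the slope equals the built-in effect. With real, unbalanced data a regression routine with explicit fixed effects would be needed instead.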

Causal Identification Assumption and Robustness Checks
The central causal identification assumption is that the residual vote rate would have evolved in parallel in municipalities with and without Internet voting had Internet voting never been introduced. To assess the plausibility of the parallel trends assumption, I consider the evolution of the residual vote rate in the period before the first i-voting trial in September 2004. Figure 3 shows annual averages of the number of residual votes from 2001 to mid-2004 by treatment status. Municipalities with at least one i-voting trial in subsequent years are assigned to the treatment group. All others are assigned to the control group. As becomes evident, the residual vote rate follows a very similar trajectory in treated and control municipalities before the first i-voting trial. This is of course no direct test of the parallel trends assumption, which relates to a counterfactual and is therefore fundamentally untestable. However, evidence for parallel pre-treatment trends makes it plausible that the residual vote rate would also have developed in parallel after September 2004 had i-voting never been introduced, and hence that the reduction in residual votes attributed to i-voting has a causal interpretation.
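The descriptive comparison behind such a pre-treatment plot can be sketched as follows. This is an illustrative Python example with made-up numbers, not the data or code behind Figure 3, and the function names `annual_group_means` and `gap_by_year` are hypothetical. Under parallel pre-treatment trends, the gap between the treated and control averages should stay roughly constant from year to year.

```python
from collections import defaultdict


def annual_group_means(rows):
    """Average the residual vote rate by (year, group).

    rows: iterable of (year, group, rate), with group in {"treated", "control"}.
    """
    sums, counts = defaultdict(float), defaultdict(int)
    for year, group, rate in rows:
        sums[(year, group)] += rate
        counts[(year, group)] += 1
    return {key: sums[key] / counts[key] for key in sums}


def gap_by_year(means):
    """Treated-minus-control difference in the annual averages."""
    years = sorted({year for year, _ in means})
    return {y: means[(y, "treated")] - means[(y, "control")] for y in years}


# Illustrative observations only (not study data).
rows = [
    (2001, "treated", 5.6), (2001, "treated", 5.4),
    (2001, "control", 5.0), (2001, "control", 5.0),
    (2002, "treated", 5.9), (2002, "control", 5.4),
    (2003, "treated", 5.2), (2003, "control", 4.7),
]

gaps = gap_by_year(annual_group_means(rows))
# A small spread of the gap across years is consistent with parallel trends.
gap_spread = max(gaps.values()) - min(gaps.values())
```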
I provide additional, statistically-based evidence for the parallelism of pre-treatment trends based on a placebo test. For this purpose, I define a placebo treatment and code it 1 for all referendums voted on during the 3 voting days before the first i-voting trial in a municipality, where applicable. I then estimate an analogous two-way fixed effects model including both the indicator for actual i-voting availability and the placebo treatment. As would be expected if pre-treatment trends are parallel, model 1 in Table 3 shows that the placebo treatment has no statistically significant effect on the residual vote rate (p = 0.20).
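The coding of the placebo treatment can be illustrated with a short Python sketch. The function name `placebo_indicator`, the date format, and the dates themselves are assumptions made for illustration and do not correspond to Geneva's actual voting calendar. Municipalities that never held an i-voting trial receive a placebo value of 0 throughout.

```python
def placebo_indicator(voting_days, first_trial_day, window=3):
    """Flag the `window` voting days immediately before the first i-voting trial.

    voting_days: iterable of ISO date strings (these sort chronologically).
    first_trial_day: ISO date string, or None if the municipality never
    offered i-voting.
    Returns a dict mapping each voting day to 0 or 1.
    """
    days = sorted(voting_days)
    if first_trial_day is None or window <= 0:
        return {day: 0 for day in days}
    earlier = [day for day in days if day < first_trial_day]
    placebo_days = set(earlier[-window:])
    return {day: int(day in placebo_days) for day in days}


# Hypothetical voting calendar for one municipality.
calendar = [
    "2003-02-09", "2003-05-18", "2003-11-30",
    "2004-02-08", "2004-05-16", "2004-09-26", "2004-11-28",
]
treated_muni = placebo_indicator(calendar, first_trial_day="2004-09-26")
never_treated = placebo_indicator(calendar, first_trial_day=None)
```

The resulting indicator would then enter the regression alongside the actual i-voting availability indicator. A coefficient near zero on the placebo suggests no differential movement in the residual vote rate shortly before the trials began.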
To further probe the internal validity of the results, I re-estimate the model while accounting for several additional municipality-level covariates that could plausibly affect the number of residual votes: per capita income, unemployment rate, population size (logged), age group shares (20-34; 35-49; 50-65; and 65+), the share of non-Swiss nationals, and vote shares for left-wing parties in the previous national election (lower chamber). 11 All data are drawn from official statistics. Data on the average income in municipalities are available only until and including 2015, so all referendums voted on in 2016 are now dropped. Moreover, expatriate voters are no longer included because data on several covariates are unavailable (e.g., unemployment rate) or do not apply (e.g., share of foreigners). Further improving confidence in the estimated effect, model 2 in Table 3 shows that the coefficient for i-voting remains almost identical in both size (−0.32) and statistical significance (p = 0.002). Similar conclusions are reached for this as well as all other robustness checks when disaggregating the sample into federal and cantonal referendums (see Tables S3 and S4 in the Online Supplement).

Table 3 Robustness checks. The table shows coefficients from OLS regressions including municipality and referendum fixed effects as well as quadratic municipality-level time trends. All models control for electoral turnout. Standard errors clustered by municipality (and, in model 4, by municipality and voting day) are shown in parentheses and p-values in square brackets. All models include both cantonal and federal referendums.

11 Unfortunately, municipal-level data on education levels is unavailable beyond the year 2000 due to a change in census data collection practices. However, the control for per capita income should at least partly make up for this. It is also worth noting that per capita income is far from statistically significant (p = 0.38).
Next, I re-estimate the treatment effect while dropping cases with simultaneous municipal elections or referendums (see model 3 in Table 3). Simultaneous municipal electoral contests are rare (less than 1.5% of observations are concerned) and most municipal electoral contests are low-key affairs, similarly to cantonal referendums. Still, some municipal contests may mobilize additional voters, leading to potential violations of the parallel trends assumption. Reassuringly, the effect of i-voting on the residual vote rate remains virtually unchanged when dropping observations with simultaneous municipal contests.
Finally, I re-estimate the treatment effect while clustering standard errors by municipality and voting day. This adjusts standard errors for contemporaneous dependence among referendums voted on the same day, in addition to time dependence (Cameron et al. 2011). I find that the variance estimate remains similar (p = 0.005) (see model 4 in Table 3), including when estimating separate models for cantonal (p = 0.001) and federal (p = 0.08) referendums (see Tables S3 and S4 in the Online Supplement).
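The two-way clustering adjustment combines three one-way cluster-robust variance estimates: the variance clustered by municipality, plus the variance clustered by voting day, minus the variance clustered by their intersection. The Python sketch below illustrates this for a single regression slope. It is a simplified illustration rather than the estimation code behind Table 3: it assumes that the regressor has already been residualized on the fixed effects and controls, it applies no small-sample correction, and the function names and input numbers are hypothetical.

```python
from collections import defaultdict


def cluster_variance(x, resid, cluster_ids):
    """One-way cluster-robust (sandwich) variance of a single OLS slope.

    x: residualized regressor values; resid: regression residuals;
    cluster_ids: cluster label for each observation.
    """
    scores = defaultdict(float)
    for xi, ei, g in zip(x, resid, cluster_ids):
        scores[g] += xi * ei              # sum the score within each cluster
    meat = sum(s * s for s in scores.values())
    bread = sum(xi * xi for xi in x)
    return meat / bread ** 2


def two_way_cluster_variance(x, resid, munis, days):
    """Variance clustered by municipality and by voting day.

    Adds the two one-way variances and subtracts the variance clustered on
    the municipality-by-day intersection, which is otherwise counted twice.
    """
    intersection = list(zip(munis, days))
    return (
        cluster_variance(x, resid, munis)
        + cluster_variance(x, resid, days)
        - cluster_variance(x, resid, intersection)
    )


# Illustrative inputs only (not study data).
x = [1.0, -1.0, 1.0, -1.0]
resid = [0.5, 0.2, -0.1, 0.4]
munis = ["A", "A", "B", "B"]
days = [1, 2, 1, 2]

variance = two_way_cluster_variance(x, resid, munis, days)
standard_error = variance ** 0.5
```

In small samples the combined estimate can turn out negative, so applied work typically relies on an established implementation rather than a hand-rolled formula like this one.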

Indirect Effect on Electoral Outcomes?
Having established the robustness of the causal estimate, I finally turn to the question of whether the reduction in the residual vote rate affected the outcomes of referendums in Geneva canton. As noted in the introduction, prior evidence suggests that voting errors are disproportionately committed by voters with low educational attainment (Bullock and Hood 2002; Tomz and Van Houweling 2003). Therefore, an important hope associated with error-reducing voting technologies has been that they reduce education-based disparities in uncounted ballots and can thereby improve the representation of less educated voters (Knack and Kropf 2003; Tomz and Van Houweling 2003). In line with this, Fujiwara (2015) found evidence that the introduction of DREs in Brazil increased electoral support for policy proposals that directly benefit voters with low education, such as economic redistribution and a strong welfare state, as well as support for left-wing parties more generally. However, whether such findings generalize to the case of Internet voting is not clear, especially as online voting is less frequently used by voters with low levels of education.
To investigate potential knock-on effects on electoral outcomes, I repeat the estimation set-up from above while switching the dependent variable from the residual vote rate to direct-democratic policy choices made by Geneva's voters. As above, all models control for electoral turnout so as to make sure that the measured effect of the availability of i-voting reflects the implications of the reduction in uncounted ballots. 12 Table 4 shows the results.
I consider a total of four dependent variables. The first two enable me to investigate whether the reduction in uncounted ballots shifted referendum outcomes systematically to the left (or right). These measures leverage the fact that parties in Switzerland almost always publish voting recommendations prior to referendums. Using this information I coded two variables that, respectively, record the share of valid votes cast in favor of the options recommended by the Socialists (the largest left-wing party in Geneva canton) and the Greens (the second-largest left-wing party). For example, the first variable records the vote share in favor of a proposal if the Socialists recommended a 'yes' vote, and the no share if the Socialists recommended a 'no' vote. Models 1 and 2 in Table 4 show that the introduction of i-voting, conditional on turnout, had no statistically significant effect on levels of voter support for policy proposals favored by left-wing parties (p = 0.45 and 0.92, respectively). This suggests that the reduction in residual votes did not pull electoral outcomes towards the left (or the right).
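The coding rule for these party-aligned outcome variables can be written down compactly. The Python sketch below is illustrative only: the function name `aligned_vote_share` and the example numbers are hypothetical, shares are assumed to be expressed in percent of valid votes, and the no share is assumed to equal 100 minus the yes share because valid votes consist of 'yes' and 'no' votes only. Referendums without a party recommendation are treated as missing, which is an assumption of this sketch.

```python
def aligned_vote_share(yes_share, recommendation):
    """Share of valid votes cast for the option a party recommended.

    yes_share: yes votes in percent of valid votes (0 to 100).
    recommendation: "yes", "no", or None if the party made no recommendation.
    Returns None when there is no recommendation to align with.
    """
    if recommendation is None:
        return None
    if recommendation == "yes":
        return yes_share
    if recommendation == "no":
        return 100.0 - yes_share          # no share among valid votes
    raise ValueError(f"unexpected recommendation: {recommendation!r}")


# Hypothetical referendum with a 62% yes share in one municipality.
support_if_yes_recommended = aligned_vote_share(62.0, "yes")
support_if_no_recommended = aligned_vote_share(62.0, "no")
```

The same rule, with the direction of a proposal taking the place of the party recommendation, applies to outcome variables that record support for a policy direction rather than for a party's position.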
Next, I analyze whether i-voting more specifically affected voter support for economic redistribution. To test this I manually identified referendums with direct implications for the level of economic redistribution from richer to poorer citizens. I was able to identify 66 relevant referendums. 13 I then used these 66 ballot proposals to code a variable recording the share of valid votes for increased redistribution. The variable corresponds to the yes share if a proposal would increase redistribution (e.g., a 2014 proposal for a federal minimum wage) and the no share for proposals that would lower redistribution (e.g., a 2002 proposal to cut unemployment benefits). Again, I find no evidence for an indirect effect on referendum outcomes (see model 3 in Table 4).

As elsewhere in Western Europe, the working class in Switzerland has increasingly turned to right-populist parties advocating culturally conservative policy proposals (Oesch 2012). As a final step, I therefore consider whether i-voting affected support for nativist policy proposals, such as a 2014 proposal to cut "mass immigration", as well as other proposals generally associated with the cultural conservatism to cultural liberalism continuum, such as European integration, law and order, gay rights, and abortion (cf. Kriesi et al. 2006). I found a total of 38 relevant referendums during the period studied, and coded an analogous variable recording the share of valid votes in favor of cultural conservatism. 14 This variable records the yes share if a proposal would lead to more cultural conservatism (e.g., the previously mentioned proposal to cut immigration) and the no share if a proposal would lead to less cultural conservatism (e.g., a 2005 proposal to grant foreigners the right to vote in municipal elections). Again, I find no evidence that the reduction in residual votes affected referendum outcomes (see model 4 in Table 4).

Figure 4 plots average vote shares for left-wing parties as well as support for economic redistribution and cultural conservatism prior to the first i-voting trial. It becomes visible that the different dependent variables evolved approximately in parallel in treated and control municipalities. 15 All four models also pass placebo treatment checks (see Table S7 in the Online Supplement). 16 Analogously to above, this can be seen as supporting the parallel trends assumption and, hence, that the effect estimates have a causal interpretation. Tables S8 and S9 in the Online Supplement show that the results remain unchanged when controlling for additional time-varying covariates and when dropping observations with simultaneous municipal contests. Finally, I also find no effects on electoral outcomes when disaggregating the sample into cantonal and federal referendums (see Tables S10 and S11 in the Online Supplement). This rules out an indirect effect that is visible only for cantonal referendums, which tend to see larger reductions in the residual vote rate.

14 Table S13 in the Online Supplement lists all referendums used to measure 'support cultural conservatism'.
Overall, these results suggest that i-voting's reduced error potential remained without systematic implications for the outcomes of referendums and, in particular, that the introduction of i-voting did not improve the policy representation of voters with low education. As noted, a possible explanation is that online voting is most popular with highly educated voters, which could offset any effect on the representation of less educated voters. For such an effect to materialize, more voters with low education may have to make use of the opportunity to vote online. That said, it is important to note that the estimated coefficients represent average effects. One cannot, therefore, infer that the reduced error rate did not affect the outcome of any given referendum.

Conclusion
Uncounted ballots rarely make headlines. But the number of votes that are discarded from the tallying of final results can run into the tens of thousands and, in large polities, the hundreds of thousands. This study presented new evidence that Internet voting systems can prevent voters from accidentally casting residual votes. The estimated effects (a 0.25 percentage point reduction in the residual vote rate for federal referendums and almost 0.5 percentage points for cantonal referendums) may seem modest, and some prior studies, if often based on less robust research designs, reported larger effects for similar comparisons. Stewart (2006), for example, presented evidence that U.S. counties that switched from central count optical scan systems to DREs after the 2000 election reported a 0.7 percentage point reduction in the residual vote rate in the 2004 presidential election. Extending the focus to referendum ballots, Kimball and Kropf (2008) report a similar effect size for touch-screen DREs. 17 However, the estimates from these studies refer to situations in which most, if not all, voters made use of electronic voting. By contrast, the estimates from this study refer to a situation in which only a fifth of votes were cast electronically (i.e., online), and the rest by paper.

15 As above, municipalities are considered 'treated' if their citizens had the opportunity to i-vote at least once after September 2004, and as 'control' if citizens never could i-vote.
16 As above, the placebo treatments are coded 1 for the 3 voting days before the first i-voting trial in a municipality.
Therefore, despite the relatively small effect sizes, this research suggests that electronic voting systems provide an effective remedy against common mistakes made by voters. What is more, this study also points to a way in which error-reducing voting technology can be extended to remote voters. Countries around the globe increasingly allow voters to cast their votes remotely, most typically by mail (Gronke et al. 2008). However, only voters who frequent a polling station can profit from the error-reducing technologies considered in prior studies, including DREs and precinct scanners (Alvarez et al. 2013). A unique benefit of Internet voting is that it allows the extension of technological safeguards against accidental residual votes to remote voters.
Nevertheless, the results of this study cannot, and should not, be read as a blanket recommendation of Internet voting (or electronic voting more generally). One of Internet voting's weak points is that it tends to appeal more strongly to more educated (and younger) citizens. While that may change in the future (Vassil et al. 2016), all voters should be able to profit from safeguards against errors. If a polity wants to enable remote voting, that could, for example, suggest a combination of Internet voting with DREs. However, proneness to accidental residual votes is not the only criterion by which voting technologies should be judged. There are, in particular, increasing concerns about the vulnerability of electronic voting systems to third party manipulation and consequent risks to the integrity of elections. As decisions on voting technology should not be based on their proneness to residual votes alone, no clear recommendations can follow from studies of residual votes, such as this one.
At the same time, this remains a single case study. Case constraints prevented me from analyzing residual votes in elections. Most paper votes in Geneva canton are also counted with optical scanners, while in other contexts paper ballots are hand-counted. Based on a priori reasoning, there seem to be few reasons why the results of this study should not generalize to elections and to manual counting. Many kinds of elections have decidedly more complex rules than yes/no referendums and, therefore, more potential for voter error. And, while some of the issues that can emerge with optical scanners, such as stray marks, may be less of a problem if ballots are hand-counted, paper ballots remain difficult to correct and, irrespective of the counting method, cannot alert voters to undervotes or prevent overvoting. Nevertheless, to move beyond conjecturing, more empirical evidence from other case contexts is needed.
Beyond cross-validation, an important contribution of future studies could be to identify the most effective technological safeguards against residual votes. For example, Geneva's online voting system prompts voters to review their choices via the inclusion of a confirmation screen. Could accidental residual votes be reduced more effectively if voters were in addition given explicit reminders that they are about to undervote? And what if they were shown contests sequentially rather than all on the same screen, as is the case in Geneva canton? Finally, we also need more comparative evidence on the performance of electronic voting systems relative to other measures against accidental residual voting. For example, are electronic voting systems or precinct scanners more effective in reducing residual votes? Or, are improvements in paper ballot design or voter education campaigns more effective responses? While the existing literature has few answers to these questions, providing them will prove useful to electoral administrators around the world.