Time, Privacy, Robustness, Accuracy: Trade-Offs for the Open Vote Network Protocol



Introduction
Cryptographic voting protocols allow mutually-distrusting entities to verifiably compute a voting result without revealing more about the private vote inputs than the actual result. Most of these protocols involve a trusted authority responsible for running the election or tallying the results. However, there exist a number of so-called "boardroom" or "self-tallying" schemes that do away with the need for a central authority [13]. In such decentralised schemes, the election is an interactive protocol between the voters only, and it can even be made one-round, i.e. non-interactive, in a public key setting [7]. Whether a centralised or decentralised protocol is better suited to a given situation depends on practical and context-specific concerns, such as whether the trusted authority assumption makes sense. In particular, the decentralised protocol can be used in settings where there is no natural trusted third party, e.g., a company surveying privacy-sensitive data of its customers.
The open vote network (OV-Net) is a self-tallying voting scheme proposed by Hao, Ryan and Zieliński [10]. Improving upon Hao and Zieliński's earlier AV-net [11,9], it is a 2-round protocol, which makes it an appealing candidate for larger-scale elections. One of OV-Net's limitations, according to Hao-Ryan-Zieliński, is that the protocol cannot handle denial-of-service (DoS) events:
"(...) For example, if some voters refuse to send data in round 2, the tallying process will fail. This kind of attack is overt; everyone will know who the attackers are. To rectify this, voters need to expel the disrupters and restart the protocol; their privacy remains intact. However the voting process would be delayed, which may prove costly for large scale (countrywide) elections (...)" [10, Sec. 3.4]
While the protection of privacy and the identification of culprits are desirable properties, the need to restart the protocol every time a voter drops out is a very strong limitation. This weakness is what we set out to rectify in this paper, by extending OV-Net to handle DoS events gracefully using parallel elections. Our modifications come at a cost, which we investigate quantitatively.
Some earlier works have already tried to improve the security and efficiency of OV-Net. In [12], fairness (i.e. preventing voters from getting partial results before casting their vote) was guaranteed by committing to the vote in the first round. Further, the robustness against denial-of-service attacks was improved by introducing a recovery round: if some voters did not participate in the second round, the remaining voters perform a third round to obtain the partial tally for their cast votes. However, this does not guarantee that there are no dropouts in the recovery round. In [7] it was shown that, using a bilinear group setting and assuming a public key infrastructure, the voting protocol can be made non-interactive, i.e. one-round. This decreases the run time considerably, but does not in itself remove the robustness problem, since the list of voters has to be determined before the election and the result cannot be computed without every eligible voter participating. Finally, in [15] the OV-Net was implemented via a smart contract that financially punishes voters who drop out of the election. This gives an economic incentive to participate in the second round, but does not prevent dedicated DoS attacks, nor involuntary dropouts, e.g. due to lack of network access, and it assumes that the participants are willing to risk the economic punishment in the first place.

Notations
Throughout this paper, we will use the following notations. If $X$ is a finite set, $x \xleftarrow{\$} X$ means that $x$ is sampled uniformly at random from $X$. When working in a cyclic group $G$ generated by $g$, we write $[x]$ to denote $g^x$. If $q > 1$ is an integer, we denote by $\mathbb{Z}_q := \mathbb{Z}/q\mathbb{Z}$ the ring of integers modulo $q$. We denote by $\mathbf{1}$ the vector whose coordinates are all 1. $BD(n, p)$ denotes the binomial distribution with success probability $p$ for a population of size $n$.
Note that, due to the page limit, a longer version of this paper, including proofs of the obtained results and appendices, is available in [1].

Open vote network (OV-Net)
We recall here the OV-Net protocol in the simple case of a referendum: there are two vote choices encoded as 0 or 1 and $n$ voters; each voter will cast a vote $v_i \in \{0, 1\}$ and the final tally will reveal the sum of all votes. Ultimately, we may set a threshold to choose a final winner based on the tally, but this is beyond the scope of OV-Net.
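We assume that all participants have agreed ahead of time to use a given cyclic group $G$ with generator $g$ in which the decisional Diffie-Hellman problem is intractable. Let $q$ be the order of $G$. Each voter $i \in \{1, \dots, n\}$ samples a random value $x_i \xleftarrow{\$} \mathbb{Z}_q$ as a secret.
Following [10], the protocol then runs in two rounds. In the first round, each voter $i$ publishes $g^{x_i}$ together with a zero-knowledge proof of knowledge of $x_i$; once all values are published, anyone can compute $g^{y_i} = \prod_{j<i} g^{x_j} / \prod_{j>i} g^{x_j}$, i.e. $y_i = \sum_{j<i} x_j - \sum_{j>i} x_j$. In the second round, each voter $i$ publishes $g^{x_i y_i} g^{v_i}$ together with a zero-knowledge proof that $v_i \in \{0, 1\}$.
At the end of this procedure, each voter checks the proofs of knowledge of all others, and multiplies together all the $g^{x_i y_i} g^{v_i}$'s. Since $\sum_i x_i y_i = 0$ by the definition of $y_i$, the result is $g^{\sum_{i=1}^{n} v_i}$, from which the value $\sum_{i=1}^{n} v_i$ can be recovered by solving the discrete logarithm problem in $G$. This is tractable because $n$ is small (by cryptographic standards), with the total world population being less than $2^{34}$. Thus generic algorithms such as Pollard's $\rho$, with a complexity of $O(\sqrt{q})$, can be used here.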
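The self-tallying step can be illustrated with a minimal Python sketch. It assumes the round structure of [10] (each voter first publishes $g^{x_i}$, then $g^{x_i y_i} g^{v_i}$ with $y_i = \sum_{j<i} x_j - \sum_{j>i} x_j$), omits every zero-knowledge proof, and works in a toy group (the subgroup of order 1019 of $\mathbb{Z}_{2039}^*$ generated by 4) that is far too small for real use. The name `ov_net_tally` is hypothetical, and a brute-force search stands in for a generic discrete logarithm algorithm.

```python
import secrets

# Toy group (assumption, insecure): P = 2*Q + 1 with P and Q prime; G = 4 has order Q mod P.
P, Q, G = 2039, 1019, 4

def ov_net_tally(votes):
    """Run a proof-less OV-Net referendum on a list of 0/1 votes and return the tally."""
    n = len(votes)
    assert n < Q and all(v in (0, 1) for v in votes)
    x = [secrets.randbelow(Q) for _ in range(n)]       # secrets x_i
    round1 = [pow(G, xi, P) for xi in x]               # round 1: publish g^{x_i}
    round2 = []
    for i in range(n):
        num = den = 1
        for j in range(i):                             # product of g^{x_j} for j < i
            num = num * round1[j] % P
        for j in range(i + 1, n):                      # product of g^{x_j} for j > i
            den = den * round1[j] % P
        g_yi = num * pow(den, -1, P) % P               # g^{y_i}, computable by anyone
        # round 2: publish g^{x_i y_i} * g^{v_i}
        round2.append(pow(g_yi, x[i], P) * pow(G, votes[i], P) % P)
    product = 1
    for ballot in round2:                              # the x_i y_i terms cancel out
        product = product * ballot % P
    acc = 1
    for tally in range(n + 1):                         # brute-force discrete logarithm
        if acc == product:
            return tally
        acc = acc * G % P
    raise ValueError("no tally found: some ballot was malformed")

print(ov_net_tally([1, 0, 1, 1, 0]))
```

Only the product of all second-round values is meaningful: a single missing ballot leaves the $x_i y_i$ terms uncancelled, which is the failure mode discussed next.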
Remark 1. The OV-Net protocol can be extended to more than two candidates by an appropriate encoding of $v_i$ [6,2], with the final tally requiring a (superincreasing) knapsack resolution after a discrete logarithm computation [10, Sec. 2.2]. Here we focus on the simpler case of two candidates.

Denial of Service
In the description of OV-Net, we implicitly assume that all participants are honest, to the extent that the proofs of knowledge are valid and that they follow the protocol. If one or several voters publish an incorrect proof of knowledge, or do not follow through with the protocol, then it is impossible to reach a conclusion for this particular vote event. This is called a denial of service (DoS) event.
When a DoS event occurs, the non-compliant voters can be identified and removed from a subsequent vote. However, the results for that particular vote must be discarded (or cannot be computed) and a fresh vote must take place. This is troublesome for several reasons. One reason is that as $n$ becomes large, disconnection or time-out events become more common and therefore the protocol's failure probability increases. Another reason is that accounting for protocol errors and re-voting adds complexity to real-world OV-Net implementations.

Parallel OV-Net
We consider a modification of OV-Net where users participate in several voting sessions in parallel. However, not all voters take part in all votes, as we now explain. Let $n$ be the number of voters and $M$ the number of parallel vote sessions. Each voter will participate in $k$ pseudo-randomly chosen sessions amongst $M$.
More precisely, voter $i$ picks $k$ sessions before the protocol is run, which we call $i$'s selection. We assume that this selection is pseudo-random, i.e. that any given selection happens with the same probability $1/\binom{M}{k}$. As a result, not all sessions have the same number of voters, a phenomenon that we will need to account for.
Remark 2. A natural question is whether we could impose a more clever rule that would guarantee that there is always the same number of voting opportunities for each of them. Indeed, a solution is provided, in some cases, by Steiner systems [3]: a Steiner system with parameters $t, k, n$, written $S(t, k, n)$, is an $n$-element set $S$ together with a set of $k$-element subsets of $S$ (called blocks) with the property that each $t$-element subset of $S$ is contained in exactly one block.
The existence of Steiner systems is deeply connected to number-theoretic properties, and in particular the existence of a S(t, k, n + 1) system precludes that of a S(t, k, n).Thus, although we could initially form a balanced set of voters in some initial setting, it cannot be done if any of the voters bails out (or is disconnected).
However, it is not obvious how a decentralised pool of voters could agree on such a setting in a non mutually-trusting way and without leaking private information.It also remains an interesting question whether approximately balanced block designs exist that are "stable" in the sense that they retain this property when elements are removed.
Should a voter drop out during a voting session, this particular session will be discarded, but all sessions in which this voter did not participate will go through. Unfortunately, this also discards all the votes of honest voters in the dropped session. To overcome this exclusion we allow each voter to vote $k$ times: in other words, each voter will cast $k$ votes into $k$ independent ballots amongst the $M$.
Our claim is that in this case, the final tally's result reflects the choice of honest voters even after discarding all the sessions that were blocked by a dishonest voter. Furthermore, when several voters are dishonest, their cumulative effect on the final tally is weighed down by the fact that they share many vote sessions. Concretely, for $k = M/2$, the first dishonest voter makes about $M/2$ sessions invalid; but amongst the remaining sessions only about $M/4$ can share a second dishonest voter, etc. Hence, this setting tolerates roughly $\log_2 M$ dropouts, at the price of running $M$ sessions.
In summary, by running several sessions, several competing phenomena occur:
1. The overall protocol's resilience against DoS events is improved as we run more sessions; more sessions however bring an additional computational and communication cost.
2. Sessions have a varying number of voters in them, and not every voter partakes in every session, which introduces a bias; we can expect this bias to become small when many sessions are run.
3. The list of participants in each session is public, therefore some information about individual voters' preferences is leaked; running more sessions results in an increased loss of privacy.
There is therefore a balance to be struck, and we must quantify these phenomena more precisely.

Parallel OV-Net DoS resilience
Let $\ell$ be the number of voters causing a DoS event; they cause a (random) number $X$ of sessions to be discarded. The protocol fails when all sessions have been discarded, i.e., when $X \ge M$; this cannot happen when $\ell < M/k$. If $\ell \ge M/k$ then it is possible to stop the protocol entirely when the selections of dropping voters cover all sessions. However, the likelihood of this happening when each selection is random and independent is low, as many of the dropping voters will have sessions in common. This is a particular variant of the famous coupon collector's problem, which has been extensively studied.
Lemma 1. The average number of DoS events necessary to cause an overall failure, when we run $M$ parallel sessions and each voter partakes in $k$ of them, is
$$\binom{M}{k} \sum_{r=1}^{M} (-1)^{r+1} \frac{\binom{M}{r}}{\binom{M}{k} - \binom{M-r}{k}}.$$
When we have fewer than the critical number of DoS events, the remaining sessions can be tallied. We can estimate the number of remaining valid sessions as $\mu = M - X$:
$$\mathbb{E}[\mu] = M \left(1 - \frac{k}{M}\right)^{\ell}.$$
Finer results about the distribution of $X$ are given in Appendix A.5 in [1].
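These quantities can be explored numerically. The sketch below assumes independent, uniformly random selections. It evaluates the standard inclusion-exclusion expression for the coupon collector's problem with group drawings (which may differ in form from the expression used in the paper), the expected number of surviving sessions $M(1 - k/M)^{\ell}$, and a Monte Carlo estimate of the first quantity. All function names are hypothetical.

```python
import random
from fractions import Fraction
from math import comb

def expected_dropouts_to_fail(M, k):
    """Exact mean number of uniform k-selections needed to cover all M sessions."""
    total = comb(M, k)
    return sum(
        Fraction((-1) ** (r + 1) * comb(M, r) * total, total - comb(M - r, k))
        for r in range(1, M + 1)
    )

def expected_remaining_sessions(M, k, dropouts):
    """Mean number of sessions untouched by `dropouts` independent selections."""
    return M * Fraction(M - k, M) ** dropouts

def simulate_dropouts_to_fail(M, k, runs, seed=0):
    """Monte Carlo estimate of the same mean as expected_dropouts_to_fail."""
    rng = random.Random(seed)
    total = 0
    for _ in range(runs):
        alive = set(range(M))
        while alive:                      # one more dropout per loop iteration
            alive -= set(rng.sample(range(M), k))
            total += 1
    return total / runs

print(float(expected_dropouts_to_fail(50, 25)), simulate_dropouts_to_fail(50, 25, 2000))
print(float(expected_remaining_sessions(50, 25, 3)))
```

For $k = M/2$ the expected number of surviving sessions halves with each dropout, which matches the informal $\log_2 M$ tolerance argument given earlier.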

Tally-combining algorithms
In this section we formalise how a final result can be obtained from the parallel OV-Net protocol. It is practical at this point to use vector notations.
We make the assumption that voters are consistent, i.e., that they make the same choice across all the voting sessions in which they participate. We denote by $v_i$ the choice of voter $i$, and collect this (unknown) information into a vector $v = (v_1, \dots, v_n)$. If the vote went through with no incident, we would obtain the final tally:
$$V := \sum_{i=1}^{n} v_i = v \cdot \mathbf{1}.$$
When a voter drops out, all the sessions in which they participated are discarded. Let $0 < \mu \le M$ be the number of remaining sessions and, for each session $j \in \{1, \dots, \mu\}$, let $s_{j,i}$ be the number of times that voter $i$ participated in session $j$; hence $s_{j,i} \in \{0, 1\}$, where 0 means that voter $i$ did not partake in session $j$ and 1 means that they voted during session $j$. The tally for session $j$ is therefore $t_j := \sum_{i=1}^{n} s_{j,i} v_i = v \cdot s_j$, where $s_j := (s_{j,1}, \dots, s_{j,n})$. By definition, $s_{j,i} = 0$ if voter $i$ dropped out, and $s_j$ is non-zero (otherwise $\mu = 0$). At the end of the procedure, the following information is public knowledge:
$$T := (t_1, \dots, t_\mu), \qquad S := (s_1, \dots, s_\mu).$$
The question is now: given $(S, T)$ and the parameters $pp = (n, k, M, \mu)$, how well can we approximate $V$? To answer this question we need a precise definition of the error.
Definition 1 (Average- and worst-case error). Let $A$ be an algorithm taking as input $S$, $T$ and (implicitly) $pp$, and returning a real number. We refer to $A$ as a tally-combining algorithm, and we write $\delta(v, S) := V - A(S, T)$ for the tallying error.
Since $\delta$ depends on a choice of $v$, which is not public information, and since $S$ is a collection of randomly chosen selections, it is more meaningful to consider the average error $\mathbb{E}_{v,S}[\delta(v, S)]$, where $v$ and $S$ span all their possible values. While $A$ may give results that are close to $V$ on average, there may be corner cases in which the predicted value wanders substantially away from $V$; this phenomenon is controlled by the worst-case error $\max_{v,S} |\delta(v, S)|$, where again $v$ and $S$ span all their possible values.
A simple tally-combining algorithm is given by averaging the tallies and rescaling to account for lost sessions, i.e.
$$A(S, T) := \frac{M}{k \mu} \sum_{j=1}^{\mu} t_j$$
(we must divide by $k$ since each voter casts $k$ votes).
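A minimal sketch of this averaging rule follows, assuming the estimate takes the form $\frac{M}{k\mu}\sum_j t_j$ and that selections are drawn uniformly at random. The helper names `surviving_tallies` and `simple_tally` are hypothetical, and the simulation reads tallies in the clear instead of running OV-Net in each session.

```python
import random

def surviving_tallies(votes, M, k, dropouts, rng):
    """Tallies t_j of the sessions that contain no dropping voter."""
    n = len(votes)
    selections = [set(rng.sample(range(M), k)) for _ in range(n)]
    # Every session selected by a dropping voter is discarded.
    discarded = set().union(*(selections[i] for i in dropouts))
    return [
        sum(votes[i] for i in range(n) if j in selections[i])
        for j in range(M)
        if j not in discarded
    ]

def simple_tally(tallies, M, k):
    """Average the surviving tallies, rescale by M, divide by k votes per voter."""
    mu = len(tallies)
    if mu == 0:
        raise ValueError("all sessions were discarded")
    return M * sum(tallies) / (k * mu)

rng = random.Random(1)
votes = [rng.randint(0, 1) for _ in range(200)]
no_drop = surviving_tallies(votes, M=20, k=5, dropouts=[], rng=rng)
two_drop = surviving_tallies(votes, M=20, k=5, dropouts=[0, 1], rng=rng)
print(sum(votes), simple_tally(no_drop, 20, 5), simple_tally(two_drop, 20, 5))
```

With no dropout the estimate equals the true tally, since the $M$ tallies sum to $kV$; with dropouts it is only an approximation, which motivates the error measures defined above.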
See also [1] for the worst-case values. More generally, let $x = (x_1, \dots, x_\mu)$ be a vector of real coefficients, and define the weighted tally-combining algorithm $A_x(T) = x \cdot T$, which gives the result
$$A_x(T) = \sum_{j=1}^{\mu} x_j t_j = \sum_{j=1}^{\mu} x_j (v \cdot s_j) = v \cdot w, \qquad w := \sum_{j=1}^{\mu} x_j s_j.$$
How do we choose $x$? The following result partially answers this question.
Theorem 1. A sufficient condition for the bias of $A_x$ to be zero on average is
$$\mathbf{1} \cdot (\mathbf{1} - w) = 0.$$
Proof. See Appendix A.4 in [1].
If $S$ spans $\mathbb{R}^n$, then by definition of a generating family we can find $\{x_1, \dots, x_\mu\}$ such that $w = \sum_j x_j s_j = \mathbf{1}$. (For the average value of $\mu$ such that $S$ spans $\mathbb{R}^n$, and for more precise results, see [4].) Concretely, we can construct an orthonormal basis of $\mathbb{R}^n$ from vectors of $S$ and project $\mathbf{1}$ onto each coordinate. We dub this method of computing $x$ the minimum variance tally-combining algorithm (MV, Table 1). When $S$ spans $\mathbb{R}^n$, the MV algorithm gives an exact result (zero bias and variance).
Table 1. Algorithm for minimum variance tally combining (MV).
However, when $S$ does not span $\mathbb{R}^n$, the MV algorithm can only find a vector $w$ close to $\mathbf{1}$, namely the closest such vector in terms of Euclidean distance that can be expressed in terms of vectors in $S$. This is still the solution resulting in the smallest variance, but no longer the solution with the least bias! This leads us to consider the following approach: we can construct tally-combining algorithms that guarantee zero bias, and select amongst these an algorithm that minimizes variance. Indeed, the constraint $\mathbf{1} \cdot (\mathbf{1} - w) = 0$ can be guaranteed by determining $x_1$ as a linear function of the other variables. It remains to minimize $\|\mathbf{1} - w\|_2^2$, which is simply a quadratic form in $\mu - 1$ variables. Therefore its minimum is easy to find, as it amounts to solving a linear system in $\mu - 1$ rational variables. We call the corresponding algorithm the zero-bias minimum variance tally-combining algorithm (ZBMV, Table 2). In Table 2, "symbolic expression" refers to the notion that $x_1, \dots, x_\mu$ are not evaluated but are symbols to be manipulated formally.
Table 2. Algorithm for zero-bias minimum variance tally combining.
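Both weightings can be sketched with exact rational arithmetic. This is not the algorithm of Tables 1 and 2: instead of an orthonormal basis or symbolic elimination of $x_1$, the sketch solves the normal equations for MV and a Lagrange-multiplier system for ZBMV, and it assumes that the session vectors $s_j$ are linearly independent so that both systems are invertible. All function names are hypothetical.

```python
from fractions import Fraction

def solve(A, b):
    """Solve A x = b exactly by Gauss-Jordan elimination (A must be invertible)."""
    n = len(A)
    rows = [[Fraction(a) for a in row] + [Fraction(bi)] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        inv = 1 / rows[col][col]
        rows[col] = [a * inv for a in rows[col]]
        for r in range(n):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [a - factor * p for a, p in zip(rows[r], rows[col])]
    return [row[n] for row in rows]

def gram_and_sizes(S):
    """Gram matrix (s_j . s_l) and session sizes c_j = 1 . s_j."""
    gram = [[sum(a * b for a, b in zip(sj, sl)) for sl in S] for sj in S]
    sizes = [sum(sj) for sj in S]
    return gram, sizes

def mv_weights(S):
    """Minimise ||1 - w||^2 with w = sum_j x_j s_j (no bias constraint)."""
    gram, sizes = gram_and_sizes(S)
    return solve(gram, sizes)

def zbmv_weights(S):
    """Minimise ||1 - w||^2 subject to the zero-bias constraint 1 . w = n."""
    n = len(S[0])
    gram, sizes = gram_and_sizes(S)
    # Stationarity: gram x + lam * sizes = sizes; constraint: sizes . x = n.
    A = [row + [c] for row, c in zip(gram, sizes)] + [sizes + [0]]
    return solve(A, sizes + [n])[:-1]

def combine(x, T):
    """Weighted tally-combining algorithm A_x(T) = x . T."""
    return sum(xj * tj for xj, tj in zip(x, T))

S = [[1, 1, 0], [0, 1, 1]]      # two surviving sessions, three voters
T = [1, 1]                      # tallies for the votes v = (1, 0, 1)
print(mv_weights(S), combine(mv_weights(S), T))
print(zbmv_weights(S), combine(zbmv_weights(S), T))
```

In this small instance $S$ does not span $\mathbb{R}^3$, so the two weightings differ: the unconstrained minimiser does not satisfy $\mathbf{1} \cdot w = n$, whereas the constrained one does by construction.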
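Algorithm 1 (Zero-bias minimum variance). We can express $x_1$ in terms of $x_2$ and $x_3$ to ensure zero bias: since $\mathbf{1} \cdot w = n$ must hold with $w = x_1 s_1 + x_2 s_2 + x_3 s_3$, we have $x_1 = (n - x_2 \, \mathbf{1} \cdot s_2 - x_3 \, \mathbf{1} \cdot s_3) / (\mathbf{1} \cdot s_1)$.
We denote by $M_i$ the set of voters who participated in election $i$, and we consider that the elections are enumerated from 1 to $M$. Let $\mathrm{Res}(M_i)$ be the random variable that gives the number of "Yes" votes in the set $M_i$. We recall also that $Y_i$ is the random variable that gives the number of voters in the set $M_i$.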

Definitions and Assumptions
To quantify privacy, we use the $\delta$-privacy definition for voting from [14], which assumes that, besides the voting elements of a voting protocol, there exists an additional party called an observer $O$, who can observe publicly available information. Moreover, we assume that among the $n$ honest voters, there exists a voter $V_{obs}$ who is under observation. For the sake of clarity, $V_{obs}$ will refer at the same time to the voter under observation and to its vote.
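Definition 2. Let $P$ be a voting protocol and $V_{obs}$ be the voter under observation. We say that $P$ achieves $\delta$-privacy if, for any two vote choices $j$ and $j'$, the difference between the probabilities
$$\Pr[(\pi_O \,\|\, \pi_{V_{obs}}(j) \,\|\, \pi_v)^{(\ell)} \mapsto 1] - \Pr[(\pi_O \,\|\, \pi_{V_{obs}}(j') \,\|\, \pi_v)^{(\ell)} \mapsto 1]$$
is $\delta$-bounded as a function of the security parameter $\ell$, where $\pi_O$, $\pi_{V_{obs}}$ and $\pi_v$ are respectively the programs run by the observer $O$, the voter under observation $V_{obs}$ and all the honest voters $v$ (clearly without $V_{obs}$).
To calculate the privacy we use the following result from [14]:
$$\delta = \sum_{r \in M^*_{Yes,No}} \left(A^{No}_r - A^{Yes}_r\right) \qquad (1)$$
where $M^*_{Yes,No} = \{r \in R : A^{Yes}_r \le A^{No}_r\}$, $R$ is the set of all possible election results and $A^{j}_r$ denotes the probability that the choices of the honest voters yield the result $r$ of the election given that $V_{obs}$'s choice is $j$.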
We consider a referendum with $n$ honest voters with a uniform distribution between yes and no votes. For simplicity, we will assume that nobody abstains. We also assume that no voters are corrupted. This is reasonable, since instructing corrupted voters to vote in a special way does not give further advantage compared to simply knowing the corrupted voters' votes. Moreover, we assume that at least one of the elections in which $V_{obs}$ participated survives.

Basic Cases:
The $\delta$ for a single referendum is:
where the first equality holds using the result from (1) and the second one using the binomial theorem. The formula above refers to the case $M = k = 1$, where all voters have chosen to vote in the same and unique election 1. For the case $M > 1$ and $k = 1$, $\delta$ becomes a random variable, and the expected value of $\delta$ for the election in which $V_{obs}$ is participating can be defined as follows:
where $Y_i$ is the random variable that gives the number of voters who participated in election $i$, including $V_{obs}$; and (2) for $k = 1$ and $M > 1$ becomes:
Figure 2 shows that privacy is almost lost when $M$ is large compared to $n$.
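The basic cases can be evaluated numerically with a short sketch. It rests on several assumptions that are not taken verbatim from the text: that the result of [14] reads $\delta = \sum_{r \in M^*_{Yes,No}} (A^{No}_r - A^{Yes}_r)$, that $n$ counts $V_{obs}$, that the other $n - 1$ voters vote uniformly and independently, and that for $k = 1$ each of them joins the election of $V_{obs}$ independently with probability $1/M$. The function names are hypothetical.

```python
from fractions import Fraction
from math import comb

def binom(a, b):
    """Binomial coefficient, taken to be 0 outside the range 0 <= b <= a."""
    return comb(a, b) if 0 <= b <= a else 0

def delta_single(n):
    """delta for one referendum with n voters, V_obs included (case M = k = 1)."""
    total = Fraction(0)
    for r in range(n + 1):                       # r = number of Yes votes
        a_yes = Fraction(binom(n - 1, r - 1), 2 ** (n - 1))
        a_no = Fraction(binom(n - 1, r), 2 ** (n - 1))
        if a_yes <= a_no:                        # r belongs to M*_{Yes,No}
            total += a_no - a_yes
    return total

def expected_delta_k1(n, M):
    """Expected delta of V_obs's election when k = 1 and there are M elections."""
    p = Fraction(1, M)
    return sum(
        comb(n - 1, j) * p ** j * (1 - p) ** (n - 1 - j) * delta_single(j + 1)
        for j in range(n)                        # j = other voters in the election
    )

for M in (1, 10, 100, 1000):
    print(M, float(expected_delta_k1(100, M)))
```

Under these assumptions the expected $\delta$ grows with $M$ for fixed $n$, since $V_{obs}$ shares its election with fewer and fewer voters.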

General Case
In this part we give a general formula for $\delta$. To this end, we consider the following. Let $y = (y_1, \dots, y_M)$ be an assignment of voters such that $\mathrm{Card}(M_i) = y_i$ for $i \in [1, M]$. We can obtain all the possible assignments of voters by respecting the condition $\sum_{i=1}^{M} y_i = kn$. Recall that $\mathrm{Res}(M_i)$ gives the number of "Yes" votes in $M_i$. We have $\mathrm{Res}(M_i) \sim BD(y_i, 1/2)$ for $i \in [1, M]$. Intuitively, $\delta$ can be expressed as the following:
By definition of $A^{j}_r$ we have $A^{j}_r = P(\mathrm{Res}(M_1) = r_1, \dots, \mathrm{Res}(M_M) = r_M \mid V_{obs} = j)$ with $j \in \{Yes, No\}$.
To proceed we will introduce an additional notation. Remember that $M_i$ denotes the voters in election $i$. Define $\Sigma_k$ as the set of subsets of $\{1, \dots, M\}$ of cardinality $k$. For $\sigma \in \Sigma_k$ we define $M_\sigma = \bigcap_{i \in \sigma} M_i$, i.e. the voters participating in the elections in the set $\sigma$. Note that the assignment of voters to elections is uniformly random, i.e. each voter is assigned uniformly and uniquely to an $M_\sigma$. Also, $Z_\sigma$ is the random variable determining the number of voters in $M_\sigma$.
There are $c = \binom{M}{k}$ possible $M_\sigma$'s. Suppose that the $\sigma$'s are enumerated from 1 to $c$. Let $z = (z_{\sigma_1}, \dots, z_{\sigma_c})$ be an assignment of voters such that $\mathrm{Card}(M_{\sigma_i}) = z_{\sigma_i}$ for $i \in [1, c]$. All the possible assignments of voters $z$ are obtained by respecting the condition $\sum_{\sigma_i \in \Sigma_k} z_{\sigma_i} = n$.
The variables $Z_\sigma$, $\sigma \in \Sigma_k$, correspond to the problem of putting $n$ indistinguishable balls into $c$ distinguishable boxes, i.e. the vector $Z = (Z_{\sigma_1}, \dots, Z_{\sigma_c})$ follows a multinomial distribution with equal parameters $p_i = 1/c$, and $\sum_{\sigma \in \Sigma_k} z_\sigma = n$ including $V_{obs}$. We can now calculate the probability for the assignment of the voters, and rewrite our formula as:
Let $r' = (r'_{\sigma_1}, \dots, r'_{\sigma_c})$ such that $r'_{\sigma_i} = \mathrm{Res}(M_{\sigma_i})$ for $(\sigma_i, i) \in \Sigma_k \times [1, c]$. The variables $\mathrm{Res}(M_\sigma)$, $\sigma \in \Sigma_k$, are independent and follow the binomial distribution of parameters $z_\sigma$ and $1/2$. In the case $M = c$, which means $k = M - 1$ or $k = 1$, there is a one-to-one correspondence between the sets $(M_i)_{i \in [1, M]}$ and $(M_\sigma)_{\sigma \in \Sigma_k}$. However, this is not true in general, and we have a relation between $r$ and $r'$ defined by the function $f$ as follows: $r = f(r')$ with $r_i = \sum_{\sigma \ni i} r'_\sigma$ for $i \in [1, M]$.
We can now calculate the probability $A^{v}_r$ as $A^{v}_r = \sum_{r' \mid r = f(r')} A^{v}_{r'}$, and we have $A^{v}_{r'} = P(\mathrm{Res}(M_{\sigma_1}) = r'_{\sigma_1}, \dots, \mathrm{Res}(M_{\sigma_c}) = r'_{\sigma_c} \mid V_{obs} = v)$. Suppose that $V_{obs}$ is in the subset $M_{\sigma_1}$; by symmetry, any other subset could be chosen. We have:
where $h(x, y) = \binom{x-1}{y-1}$ if $v$ = "Yes" and $h(x, y) = \binom{x-1}{y}$ if $v$ = "No".
privacy decrease that is expected due to the multiple partial election results. The results allow the protocol initiator to choose parameters to carefully balance the desired robustness against a controlled privacy loss, a statistical loss in accuracy, and increased computation.
Future work. An idea to consider is redistribution, i.e. elections are conducted in several electoral districts. Unlike general elections, where the final result is known for the entire country only, in redistributed elections results are consolidated per district and only then added up. This could confine problematic voters to a district of their own, as follows: partition the $n$ voters into $d$ districts of $n' = n/d$ voters, then run a vote in each of them. Then recompose the result by adding up the final tallies. This strategy confines the DoS problem to districts that do not influence each other. However, DoS tolerance is not exactly multiplied by $d$, because each district is not allowed to exceed $k$ unresponsive voters. In other words, tolerance is multiplied by $d$ only as long as there are no more than $k$ unresponsive voters per district.

Figure 1 compares simulation results to the formula of Lemma 1, showing excellent agreement. The simulation is for $M = 50$ and $k$ varying from 1 to 49, over $10^5$ runs. Using this information, we can choose parameters $M$ and $k$ to accommodate a given number of potential drop-outs.