Voting Theory for Two Parties Under Approval Rule

Two models, the Simple Ballot Model (SBM) and the Component Ballot Model (CBM), are proposed for resolving the uncertainty that arises in an election when two candidates gain the same number of votes under the approval rule. The SBM establishes a framework to support counting. To separate the two candidates, it is essential to extract additional information from the valid votes, which dominate the poll. The CBM uses probability matrices, vectors and permutation groups as components. A stable voting mechanism based on permutation invariance can be created to distinguish the candidates. The result of the chapter establishes a voting authority to resolve the uncertainty between two candidates under the approval rule.


Introduction
As a common practice in a modern democratic society, voting is a practical way to resolve a contest in which each candidate seeks to gain maximal support from the electors. Approval voting is a voting procedure in which electors can vote for as many candidates as they wish. Each approved candidate receives one vote and the candidate with the most votes wins. Approval voting, unlike more complicated ranking systems, is easy for electors to understand and use. This voting method is widely used today by various governments and organizations around the world (including by the United Nations to elect the secretary-general).
To sustain healthy economic and political progress in modern democratic societies, it is necessary to apply reliable and convenient voting methodologies and tools that ensure fairness, efficiency and transparency and that overcome paradoxes and difficulties in elections.

Brief Review of Voting Systems
Interesting voting-based models and practices can be found in many ancient stories, from Chinese literature to Roman and Greek history. In the French Academy, just before the French Revolution, de Borda [1] and de Condorcet [2] proposed the Borda rule and the Condorcet procedures. They wanted to use new voting methods to resolve the difficulties and unfair results that arose under traditional plurality-based voting rules in elections for the Academy. In the 1920s, Hotelling [3] investigated the equilibrium of spatial economic competition between two firms in terms of location and price. During World War II, von Neumann and Morgenstern [4] developed the Theory of Games using differential equations to investigate complicated competition behaviors. This theoretical foundation has had a strong influence on the development of analytical methodologies and tools, from applying pre-designed strategic policies to predicting practical election outcomes. Under fairness conditions, Arrow [5] proved his famous Impossibility Theorem, which claims that there is no single election procedure that fairly decides the outcome of an election involving three or more candidates. Various ideas, methods and technologies have emerged to resolve voting difficulties [6][7][8][9].

Problems in the 2000 American Election
The most debated problem in the 2000 American election (the 2K-election) is whether the machine-rejected ballots needed to be manually recounted.
In practice, the 2K-election problem was finally decided by the votes of the nine justices of the US Supreme Court on the lawsuits arising from the Florida Supreme Court.
This indicates that current voting theories and vote-counting models all fall short of serving as an authority for resolving the problem.
Although the 2K-election was held under the plurality rule rather than the approval rule, the approval rule cannot guarantee that similar uncertainty will be avoided when a large number of electors are involved. It is necessary to establish a relevant theoretical structure to avoid possible problems in the future.

Structure of the Chapter
This chapter proposes two models that together construct a voting theory to resolve 2K-election-like problems and other paradoxes in voting practice. Only voting systems under the approval rule are considered.
In Sect. 2, a Simple Ballot Model (SBM) is proposed. Using the SBM, the separable and uncertain conditions for the ballot papers are established. To show some practical strategies and relevant problems in current voting methodologies, four additional rules (reducing error probability, merging other candidate votes, re-election, and court decision) that are commonly used in practical voting processes are discussed.
In Sect. 2.8, the error margin for the 2K-election problem is analyzed. Although voting practice is not an exact science, the error margin of 0.233% in the event still cannot be accepted as an accurate measure. Although almost 99.8% of the votes were valid and counted, there is still no way of determining who the winner is. Therefore, attention shifts to the 0.2% of votes which were already deemed invalid. This problem highlights that the voting system needs to improve, and a method of extracting additional information from valid votes to separate the two candidates under uncertainty conditions becomes essential.
In Sect. 3, a new voting model, the Component Ballot Model (CBM), is defined and constructed to provide the essential construction for extracting more information from votes for comparisons. Based on multiple feature matrices (similar to contingency tables in classical statistics), probability feature vectors, permutation invariant groups and other advanced mathematical tools, multiple pair sets of feature index families for two candidates are constructed. This mechanism establishes a voting authority to make a decision for an election. After the mathematical definitions and constructions of the feature matrix, feature vector, probability feature vector and feature index, the most important results are summarized in the Two-D Separable Proposition and the Voting Authority Proposition.
Because only the valid votes are taken into account, the election model has intrinsic stability and delivers reliable results immediately after the election. Confusion, frustration and dissatisfaction such as those experienced in the 2K-election can be avoided.
In the light of this research, some further research directions are suggested in Sect. 4.

Key Words in Election
Key words used in an election event can be defined as follows.
• Election: a special event based on counting votes for a winner (normally whoever attracts the most votes wins the election)
• Candidate: a person who has been nominated in an election
• Elector: a person who may legally vote in an election
• Ballot: a pre-designed form used to record choices of an elector
• Vote: a ballot on which the choices of an elector are recorded
• Poll: the collection of votes from all legal electors
• Decision: a result on who wins the election.
The Simple Ballot Model simulates the simplest scenario of the whole voting procedure, based upon all ballots directly collected from an election under the approval rule. In this scenario, one elector can create only one vote, which may approve as many candidates from the list of candidates as the elector wishes.

Definitions
For an ideal election involving $n$ ($\geq 2$) candidates, let $C = \{c_1, c_2, \ldots, c_n\}$ be a set of the selected candidates. A ballot $B = (c_1, c_2, \ldots, c_n)$ is a pre-designed form containing the list of candidates for whom the electors may vote.
A vote is a record of a ballot $B$. An elector can only create one vote and there are a total number of $N$ ($\gg n$) votes in the election.
Let $V_0$ denote the invalid poll in the election. It collects all invalid votes from the poll $V$. Let $V_c$ denote a valid sub-poll in the election. Both sub-polls $V_c$ and $V_0$ partition the poll $V$, i.e. $V = V_c \cup V_0$ and $V_c \cap V_0 = \emptyset$.
Let $V_k$ denote a sub-poll in the election. For any $k \in [1, n]$, $V_k$ collects all valid votes from the poll $V$ for the $k$th candidate.

An SBM is a collection of a ballot form, all votes, poll and poll components for an election. For any poll vector $\tilde{V}$, let $p_k = |V_k|/|V| = N_k/N$, $1 \leq k \leq n$, denote a measure of the $k$th candidate and $p_0 = |V_0|/|V| = N_0/N$ denote the measure of the invalid votes.
Under the approval rule, there are many overlaps among different sub-polls. Considering two candidate sub-polls $V_k$ and $V_l$ and their common part, if $\exists k \neq l$ such that $V_k \cap V_l \neq \emptyset$, then the candidate sub-polls do not partition the valid poll. In general, we have $\sum_{k=0}^{n} p_k \geq 1$. Let $\tilde{p} = (p_0, p_1, \ldots, p_n)$ denote a frequency vector.

One-Dimensional Feature Distribution
The frequency vector corresponds to a density distribution. Each component satisfies $0 \leq p_k \leq 1$, $0 \leq k \leq n$.
Because there is no further partition among sub-polls, the vector forms a one-dimensional frequency feature histogram.
If the sub-polls partition the poll, then $\sum_{k=0}^{n} p_k = 1$. In the worst case scenario, if all valid votes select all candidates and there are no invalid votes, then $p_0 = 0$, $p_1 = \cdots = p_n = 1$, and $\sum_{k=0}^{n} p_k = n$.
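The measures $p_0, p_1, \ldots, p_n$ can be illustrated with a minimal Python sketch. The encoding of a vote as a set of approved candidate numbers (or `None` for an invalid vote), the function name `frequency_vector` and the sample data are all assumptions made for illustration; they are not part of the model itself.

```python
from collections import Counter

def frequency_vector(votes, n):
    """Return [p_0, p_1, ..., p_n] for a list of approval votes.

    Each vote is a set of approved candidate numbers in 1..n, or None for
    an invalid vote (an illustrative encoding, not prescribed by the SBM).
    """
    total = len(votes)
    counts = Counter()
    for vote in votes:
        if vote is None:
            counts[0] += 1          # invalid vote is collected in V_0
        else:
            for k in vote:
                counts[k] += 1      # vote belongs to every approved V_k
    return [counts[k] / total for k in range(n + 1)]

# Hypothetical poll with N = 8 votes and n = 3 candidates.
votes = [{1}, {1, 2}, {2}, {1, 2, 3}, None, {3}, {2, 3}, {1}]
p = frequency_vector(votes, n=3)
print(p)       # [0.125, 0.5, 0.5, 0.375]
print(sum(p))  # 1.5
```

The sum exceeds 1 because the candidate sub-polls overlap under the approval rule. In this sample, candidates 1 and 2 also receive the same measure, which is the situation the rest of the chapter addresses.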

Separable Condition
When $\exists i, j \in [1, n]$ with $p_i, p_j > p_0$, a decision between the candidates $i$ and $j$ can be made if and only if $|p_i - p_j| > p_0$. This is the separable condition.

Uncertain Condition
However, there will be intrinsic difficulties in making a decision between the candidates $i$ and $j$ simply from their measures $p_i$ and $p_j$ if $|p_i - p_j| \leq p_0$. This is the uncertain condition. Under the uncertain condition, there is no simple way to distinguish the signals $p_i$ and $p_j$ clearly under the interference of $p_0$.
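The two conditions can be sketched as a small check. The sketch assumes that the separable condition reads $|p_i - p_j| > p_0$ and the uncertain condition reads $|p_i - p_j| \leq p_0$; this form is inferred from the surrounding discussion of the interference of $p_0$, and the function name and sample values are hypothetical.

```python
def classify_pair(p_i, p_j, p_0):
    """Classify candidates i and j as 'separable' or 'uncertain'.

    Assumed reading of the two conditions:
      separable  iff |p_i - p_j| >  p_0
      uncertain  iff |p_i - p_j| <= p_0
    """
    if not (p_i > p_0 and p_j > p_0):
        raise ValueError("both measures must exceed the invalid measure p_0")
    return "separable" if abs(p_i - p_j) > p_0 else "uncertain"

print(classify_pair(0.52, 0.46, 0.02))     # separable
print(classify_pair(0.490, 0.489, 0.002))  # uncertain
```

In the second call the gap between the two candidates is smaller than the share of invalid votes, so the invalid votes alone could reverse the outcome.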

Balanced Opposites
It is extremely hard to make any decision when both candidates gain the same number of votes in an election. However, for any equilibrium dynamic system involving two balanced opposites in competition, the most probable trend is $p_j \approx p_i$. In general, more complicated feedback mechanisms are involved and balanced events occur more frequently [10,11].

Four Additional Policies
To resolve conflicts in an election, four additional policies may be useful: reducing error probability ($p_0 \to 0$), merging other candidate votes, re-election, and court decision.
The reducing error probability policy works well under certain conditions involving only a small number of electors. Using various controlled methods, e.g., making the total number of seats in Parliament an odd number or allowing some additional votes to Parliament Leaders, the worst case scenario where both candidates hold equal votes without a decision can be eliminated. However, when an election involves a large number of electors, as in the 2K-election, the voting system naturally becomes a complex dynamic system and there is no way to make the error margin small enough.
The merging other votes policy works in simple conditions at a single location. Combining votes for candidates from multiple locations is more difficult under the approval rule than under plurality rules, since there are many overlaps among sub-polls. There is no guarantee that the policy works. In the best cases, old difficulties may be temporarily solved, but new similar uncertainties could immediately emerge.
From the viewpoint of a complex dynamic system, a re-election is the same as the original election. Therefore, the re-election policy cannot provide an improved separable property between two candidates.
If other solutions cannot be found because of timing or other issues, it is feasible to use the Courts to make the decision. The court decision policy results in efficient decision-making, but it breaks down the election procedure and may lose the fairness, transparency, self-determination and other advantages of the election process.

How Accurate Is Accurate?
It is well known that all measurements in physics and in all exact sciences are inaccurate to some degree. So, what then is sufficient to be deemed accurate for an election? Can we accept a 10% margin of error as accurate? What about 1% or even 0.1%?
In real life, an error margin of 1% would be highly commendable and one of 0.1% would be considered highly accurate.
Although voting and polling were never meant to be an exact science, polls and other pre-election statistics had error margins of almost 5-10%. Yet in the actual election, the margin of error was smaller: in the disputed counties, e.g. Miami-Dade and Palm Beach, only 14,000 votes from a total of six million votes were rejected. The margin of error was only 0.233%. Usually, this would be deemed a negligible number, as almost 99.8% of votes were valid. However, it was not enough to separate the two candidates; to do so, the rejected votes would have to be reduced from 14,000 to 100. Under this condition, an error margin of at most about 0.0017% (100 out of six million) is required. This is highly improbable due to cost, time and other factors.
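The percentages quoted above follow from simple ratios, which the short Python check below reproduces. The helper name `error_margin_percent` is introduced only for this illustration; the vote counts are the ones quoted in the text.

```python
def error_margin_percent(rejected, total):
    """Share of rejected votes, expressed as a percentage of all votes."""
    return rejected / total * 100

total_votes = 6_000_000
margin = error_margin_percent(14_000, total_votes)
required = error_margin_percent(100, total_votes)

print(f"{margin:.3f}%")        # 0.233%
print(f"{100 - margin:.1f}%")  # 99.8%
print(f"{required:.6f}%")      # 0.001667%
```

Reducing the rejected votes from 14,000 to 100 therefore corresponds to an error margin of roughly 0.0017%, about 140 times smaller than the observed 0.233%.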

Shifting Attentions from Invalid Votes to Valid Votes
Almost 99.8% of the votes are valid. This indicates that, in order to determine the winner under the uncertain condition, it is necessary to obtain additional information from the valid votes instead of reducing the error margin by handling invalid votes. The total number of votes is far greater than the number of candidates. This makes it possible to extract additional information using cross-classification methods based on contingency table-like techniques among multiple categories. The cross-classification technique is a powerful toolkit in modern statistics [12,13,14,15].
Under additional categories such as location, age group and sex, valid votes are categorized as two-dimensional classified feature distributions in the respective contingency tables. Such spatial or histogram-like feature distributions provide invaluable information for improving the separable properties between two uncertain candidates. To represent this idea, a new model is proposed in the next section.

Component Ballot Model
To overcome the intrinsic complexities and uncertainty problems in approval voting practices, a new model, the Component Ballot Model, is proposed in this section. It uses multiple variables on a ballot to give a better description and an easier comparison.

Definitions
To be consistent with the previous notation, similar symbols (ballot paper) are used. However, the contents of the ballot paper and other notations are compounded into vector forms.
Let $C = \{C_1, C_2, \ldots, C_m\}$ be a set of the selected conditions. The $i$-th item $C_i$ takes one of $n_i$ possible values. A ballot $B$ (or a component ballot) is a vector composed of $m$ items: $B = (C_1, C_2, \ldots, C_m)$. Component items in a ballot add information about the elector to the paper, such as sex, voting time, location, age group, minority status, living area, social security and employment situation.
For example, the first item contains 10 candidates, the second item presents 100,000 locations, the third item has 3 sex groups (male, female, neutral), the fourth item contains 150 age groups, and the fifth item indicates $10^{10}$ social security numbers. Under the above conditions, a ballot paper has $m = 5$ items with $(n_1, \ldots, n_5) = (10, 100{,}000, 3, 150, 10^{10})$. Additional information about electors may be accessed from existing election databases; there is no technical difficulty in merging it automatically into a compound vote using modern information technology.
There is ample room on a vote for the various parameters of an elector, and a total number of $N$ electors take part in the voting.
A poll $V$ is a vote collection in which all votes can be arranged as an array with $N$ entries: $V = (v_1, v_2, \ldots, v_N)$. Considering that each vote has $m$ items, a poll $V$ can be represented as a 2D $m \times N$ array (3.4).

Feature Partition
Let $V_c$ denote a valid poll and $V_0$ denote an invalid poll. $V_c$ and $V_0$ partition the poll: $V = V_c \cup V_0$ and $V_c \cap V_0 = \emptyset$. Let $V^i$ denote a sub-poll in the election. For any $i \in [1, m]$, $V^i$ collects all valid votes of the poll $V$ for the $i$th item.

Zero-D Feature Lemma
All sub-polls $\{V^i\}_{i=1}^{m}$ contain the same votes as the valid poll $V_c$: $V^i = V_c$ for $1 \leq i \leq m$. Proof Using Eqs.
Proof By Eq. (3.9), each vote has an identified value. There is no overlap among possible sub-polls in relation to the category item. Note that only the candidate category fails to satisfy the One-D feature corollary under the approval voting rule; the other additional categories satisfy the condition.
Unlike the Zero-D feature lemma, the One-D feature corollary provides a non-trivial partition of the votes into multiple sub-polls.
Let $V_0$ denote an invalid poll in the election. It collects all invalid votes of the poll $V$.
Since there is no further distinction among the votes in $V_0$, all votes in this poll correspond to discarded votes.
Let $V^{i,j}_{k,l}$ denote a sub-poll. It can be described as the set of valid votes that take the $k$th value of the $i$th item and the $l$th value of the $j$th item (3.13d). Proof When each vote in the sub-polls has only a single value in relation to the selected category item, the sub-polls partition the selected poll.
Under this construction, for $i, j \in [1, m]$, all votes in $\{V^{i,j}_{k,l}\}$ with $k \in [1, n_i]$ and $l \in [1, n_j]$ dissect the valid poll $V_c$.
When the single-value condition is satisfied, the sub-polls partition the valid poll. For these sub-polls, there is a unique feature matrix representation.

Feature Matrix
Let $V^{i,j}$ denote a feature matrix, $V^{i,j} = [V^{i,j}_{k,l}]$ with $k \in [1, n_i]$ and $l \in [1, n_j]$ (3.14). In statistical language, a feature matrix $V^{i,j}$ may correspond to a contingency table based on cross-classified categorical data under two selected categories [13,16,17]. Each element of the matrix collects a subset of votes with the respective cross-categorical meaning.
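The correspondence with a contingency table can be sketched in Python by counting the votes in each cell of a feature matrix for the candidate item and one additional item. The dictionary layout of a vote, the key names `approved` and `location`, and the sample votes are hypothetical choices made for this illustration.

```python
def feature_matrix(votes, n_candidates, n_values, item):
    """Count valid votes per (candidate k, value l of the chosen item).

    Each vote is a dict with an 'approved' set of candidate numbers and one
    value in 1..n_values for the additional item (illustrative encoding).
    """
    matrix = [[0] * n_values for _ in range(n_candidates)]
    for vote in votes:
        l = vote[item]
        for k in vote["approved"]:
            matrix[k - 1][l - 1] += 1   # cell |V_{k,l}| of the table
    return matrix

# Hypothetical valid votes: 2 candidates, 3 locations.
votes = [
    {"approved": {1}, "location": 1},
    {"approved": {1, 2}, "location": 2},
    {"approved": {2}, "location": 1},
    {"approved": {2}, "location": 2},
    {"approved": {1}, "location": 3},
    {"approved": {1, 2}, "location": 3},
]
counts = feature_matrix(votes, n_candidates=2, n_values=3, item="location")
print(counts)  # [[1, 1, 2], [1, 2, 1]]
```

Both candidates receive four approvals in this sample, yet their rows differ by location; this is the kind of additional information the feature matrix exposes.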

Feature Matrix Set
For a given $\{V^{i,j}_{k,l}\}$ with $i, j \in [1, m]$, $k \in [1, n_i]$ and $l \in [1, n_j]$, the feature matrices $V^{i,j}$ form a feature matrix set. Let $V_{SC(i)}$ denote the matrix set with the first index fixed at $i$. Selecting one category for both row and column values, for a given $V^{i,i} \in V_{SC(i)}$, if a vote in the $i$th category contains only one valid value, then $V^{i,i}_{k,l}$ is empty whenever $k \neq l$.
In this case, the matrix $V^{i,i}$ is a diagonal matrix. However, if, for $V^{i,i}_{k,l} \in V^{i,i}$ in $V_{SC(i)}$, a vote in the $i$th category contains multiple distinguishable values, then $V^{i,i}_{k,l}$ provides cross-classified sub-polls.
In this case, the matrix $V^{i,i}$ is a symmetric matrix.

Probability Feature Matrix
Let $P^{i,j}$ denote a probability feature matrix corresponding to the feature matrix $V^{i,j}$ and let $p^{i,j}_{k,l}$ denote its elements; for any $p^{i,j}_{k,l} \in P^{i,j}$, $0 \leq p^{i,j}_{k,l} \leq 1$. For example, with $n_1 = 6$ and $n_2 = 4$, a probability feature matrix $P^{1,2}$ is a $6 \times 4$ array (3.21).

Probability Feature Vector
For any $P^{i,j}$, at most $n_i$ row vectors in the matrix need to satisfy Eq. (3.22). Because there is no restriction among the columns of the probability feature matrix $P^{i,j}$, different categories can be selected flexibly to partition a given vote set $p^{i,j}_{k,l}$ into multiple distributions in larger selection spaces, which helps to satisfy complicated dynamic system requirements.
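A minimal sketch of how a probability feature matrix might be obtained from vote counts is given below. It assumes that the row condition of Eq. (3.22) is a normalization, i.e. that each row sums to 1; this reading, the function name and the sample counts are assumptions made for illustration.

```python
def probability_feature_matrix(counts):
    """Normalize each row of a count matrix so that it sums to 1.

    Row normalization is an assumed reading of the row condition; columns
    are left unconstrained, as stated in the text.
    """
    rows = []
    for row in counts:
        total = sum(row)
        rows.append([c / total if total else 0.0 for c in row])
    return rows

# Hypothetical counts: 2 candidates (rows) by 3 locations (columns).
counts = [[30, 50, 20], [40, 40, 20]]
P = probability_feature_matrix(counts)
print(P)  # [[0.3, 0.5, 0.2], [0.4, 0.4, 0.2]]
```

Both hypothetical candidates hold 100 votes, so their one-dimensional measures coincide, while their rows in the probability feature matrix differ.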
For a given $P^{i,j}$, if the $i$th item is a categorical index of candidates, then any candidate $k \in [1, n_i]$ has a probability feature vector corresponding to its probability densities relevant to item $j$, denoted by $\tilde{p}^{i,j}_k = (p^{i,j}_{k,1}, \ldots, p^{i,j}_{k,n_j})$. For example, polynomial indexes $\{\lambda_n\}$ can be evaluated for the sample probability matrix $P^{1,2}$ of Eq. (3.21).

Entropy Feature Index
For a probability vector $\tilde{p} = (p_1, \ldots, p_j, \ldots, p_m)$ with $m$ items, an entropy feature index $\lambda_E$ is defined by Eq. (3.37): $\lambda_E(\tilde{p}) = -\sum_{j=1}^{m} p_j \log p_j$ (3.37). In the polynomial index family $\{\lambda_n(\tilde{p})\}_{n \geq 0}$, $\lambda_0(\tilde{p})$ indicates the length of the vector and $\lambda_1(\tilde{p})$ provides the normalized measure. In addition to the $\{\lambda_n(\tilde{p})\}_{n \geq 0}$ family, $\lambda_E(\tilde{p})$ provides another type of index, related to the entropy measurement. Using one of these indexes, it is feasible to distinguish two probability vectors in different permutation groups.
For example, the entropy index $\lambda_E$ can be evaluated in the same way for the probability matrix $P^{1,2}$ of Eq. (3.21).
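The two index types can be sketched numerically. The sketch assumes that the polynomial index is the power sum $\lambda_n(\tilde{p}) = \sum_j p_j^n$, which agrees with the statements that $\lambda_0$ gives the vector length and $\lambda_1$ the normalized measure, and that the entropy index is the Shannon form $-\sum_j p_j \log p_j$ with the natural logarithm. Both forms, the function names and the sample vector are assumptions made for illustration.

```python
import math

def polynomial_index(p, n):
    """Assumed form: lambda_n(p) = sum_j p_j ** n (a power sum)."""
    return sum(x ** n for x in p)

def entropy_index(p):
    """Assumed form: lambda_E(p) = -sum_j p_j * log(p_j), skipping zeros."""
    return -sum(x * math.log(x) for x in p if x > 0)

p = [0.5, 0.25, 0.25]                # hypothetical probability vector
print(polynomial_index(p, 0))        # 3.0  (length of the vector)
print(polynomial_index(p, 1))        # 1.0  (normalized measure)
print(polynomial_index(p, 2))        # 0.375
print(round(entropy_index(p), 4))    # 1.0397
```

Neither index depends on the order of the components, which is the permutation-invariance property used in the following sections.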

Two Probability Vectors and Their Feature Indexes
For two probability vectors $\tilde{p}_k$ and $\tilde{p}_l$, if $\exists \tau$ such that $\lambda_\tau(\tilde{p}_k) \neq \lambda_\tau(\tilde{p}_l)$, then the two vectors belong to two different permutation groups.
For two probability vectors $\tilde{p}^{i,j}_k$ and $\tilde{p}^{i,j}_l$, if each vector belongs to one permutation group and neither can be generated from the other by permutation, then $\exists n > 1$ such that $\lambda_n(\tilde{p}^{i,j}_k) \neq \lambda_n(\tilde{p}^{i,j}_l)$. Under such conditions, if two vectors have different index families, then they are in different permutation groups. Put another way, when neither of two vectors can be generated from the other, at least one index is distinguishable.
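The distinguishing role of the index family can be illustrated by searching for the lowest order at which two vectors differ. The sketch assumes the power-sum form $\lambda_n(\tilde{p}) = \sum_j p_j^n$ for the polynomial index; the function names, the numerical tolerance and the sample vectors are hypothetical.

```python
import math

def polynomial_index(p, n):
    """Assumed form of the polynomial index: sum_j p_j ** n."""
    return sum(x ** n for x in p)

def first_distinguishing_order(p, q, max_order=None):
    """Return the smallest n with lambda_n(p) != lambda_n(q), else None.

    Orders 0..max_order are compared; by default max_order is the longer
    vector length, since power sums up to that order determine the values
    of a vector up to permutation.
    """
    if max_order is None:
        max_order = max(len(p), len(q))
    for n in range(max_order + 1):
        a, b = polynomial_index(p, n), polynomial_index(q, n)
        if not math.isclose(a, b, rel_tol=1e-12, abs_tol=1e-12):
            return n
    return None

a = [0.5, 0.3, 0.2]
b = [0.2, 0.5, 0.3]   # a permutation of a
c = [0.4, 0.4, 0.2]   # not a permutation of a
print(first_distinguishing_order(a, b))  # None
print(first_distinguishing_order(a, c))  # 2
```

Vectors `a` and `c` share the same length and the same normalized measure, so the indexes of order 0 and 1 cannot separate them; the first difference appears at order 2.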

CBM Construction
Let CBM denote a Component Ballot Model. A CBM is a collection of a ballot form, vote sequences, poll and poll component matrix collection, probability matrix collections with normalized probability vectors plus the selected indexing family for an election.
Comparing the SBM (Eq. 2.3) with the CBM (Eq. 3.38), it is clear that the SBM is the simplest case of the CBM and that the CBM provides more powerful properties for refined descriptions and comparisons in complicated voting applications.

Two-D Separable Proposition
For two candidates who gain a similar number of votes under the uncertain condition, it is always feasible to use other categorical information (e.g. location, age group) to re-partition the sub-polls for each candidate. If the two refined probability feature vectors belong to two permutation groups, then the uncertainty problem can be solved in most scenarios by using the polynomial feature index family or the entropy feature index.
Proof In most scenarios, cross-classified categorical data yield probability feature vectors with significant differences in their respective density distributions. Under different categories without simple correspondences, this mechanism makes it possible to use the same strategy to handle votes for candidates. Since one party may be very strong in certain policies and relatively weak in other strategies, those differences create probability feature vectors that are more easily located in different permutation groups. Even in the most balanced election events from a global viewpoint, hugely distinguishable distributions exist in local regions. This is the most important reason why two probability feature vectors produce a pair of significantly distinct feature indexes.
In a complex dynamic system, equilibrium is the most probable state when the system is in dynamic balance. However, there are significant differences among local areas even under the most balanced conditions. This is the most powerful part of the proposed model for resolving uncertainty in complex dynamic systems in general.
To avoid uncertainty and the frustration caused by an uncertain voting result, it is necessary to pre-select an odd number $m - 1 \geq 1$ of additional categories other than the candidates. The following main conclusion can be stated.

Voting Authority Proposition
If two candidates in an election under the approval rule are in uncertainty, then additional categories (an odd number $m - 1 \geq 1$) can be used under pre-agreed conditions. These create $m - 1$ pairs of feature indexes for deciding who will be the winner.
Proof According to the Two-D separable proposition, each additional category can provide a pair of significantly distinct feature indexes to separate the two candidates, and all of the selected $m - 1$ pairs have this property. Since $m - 1$ is an odd number, each pair of indexes acts as an authority vote, and the majority rule can be used to make the decision.
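The majority step of the proposition can be sketched as follows. The text does not specify how a pair of feature indexes is turned into an authority vote, so the rule used here (the candidate with the larger entropy index wins the vote for that category) is purely a hypothetical pre-agreed condition; the entropy form, the function names and the sample vectors are likewise assumptions.

```python
import math

def entropy_index(p):
    """Assumed entropy index: -sum_j p_j * log(p_j), skipping zeros."""
    return -sum(x * math.log(x) for x in p if x > 0)

def authority_decision(vectors_a, vectors_b, index=entropy_index):
    """Majority decision over an odd number of additional categories.

    vectors_a[t] and vectors_b[t] are the probability feature vectors of
    candidates A and B for the t-th additional category. 'Larger index
    wins the authority vote' is a hypothetical pre-agreed rule.
    """
    if len(vectors_a) != len(vectors_b) or len(vectors_a) % 2 == 0:
        raise ValueError("an equal, odd number of categories is required")
    pairs = list(zip(vectors_a, vectors_b))
    votes_a = sum(index(pa) > index(pb) for pa, pb in pairs)
    votes_b = sum(index(pb) > index(pa) for pa, pb in pairs)
    if votes_a == votes_b:
        return None                  # indexes failed to separate the pair
    return "A" if votes_a > votes_b else "B"

# Hypothetical vectors for m - 1 = 3 additional categories.
cand_a = [[0.5, 0.3, 0.2], [0.6, 0.4], [0.25, 0.25, 0.25, 0.25]]
cand_b = [[0.4, 0.4, 0.2], [0.5, 0.5], [0.7, 0.1, 0.1, 0.1]]
print(authority_decision(cand_a, cand_b))  # B
```

In this sample, candidate B has the larger index in the first two categories and candidate A in the third, so B wins two of the three authority votes.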

Conclusion and Further Work
In the proposed Component Ballot Model, multiple probability feature matrix collections are employed, and component categories other than the candidate are added to ballot papers to overcome the confusion and frustration that arise when two candidates are in uncertainty.
By applying invariant constructions to probability feature vectors, together with the distinguishable properties of measurements in the polynomial and entropy feature index families, the voting authority provides a stable indexing mechanism in which the whole calculation is based on valid votes. The distinguishable and invariant properties of the feature index families provide reliable measurements for election outcomes.
The basic ideas, tools and technologies in the chapter originated in the author's research in the 1990s on advanced content-based information retrieval and image feature indexing [18][19][20].
Because the approval rule is only one of the rules used in practical voting systems, readers may consult the author's other paper, which discusses related aspects of voting theory under plurality and majority rules [21]. It would be interesting to know whether the proposed model can be applied consistently to other voting systems (such as Borda rules, proportional-representation systems and preference voting systems). Similar uncertainty exists in other voting mechanisms. This will be a natural extension of the current study.
To serve practical voting systems, it is essential to establish testing frameworks that yield recommendations on the specific invariant properties contained in the proposed or new indexing families. Different voting systems may well require different combinations of feature indexing schemes to achieve their optimal properties. More case studies linking theoretical models with practical applications should be conducted to solve complicated voting paradoxes and other similar problems.
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.