# Effective group size of majority vote accuracy in sequential decision-making

## Abstract

We investigate a sequential decision-making situation wherein, with a certain probability, each voter imitates the decision of another voter who has already made a decision; otherwise, she makes a decision independently. After all individuals reach decisions, the group decision is determined using the simple majority rule. To evaluate the collective performance in this situation, we introduce the concept of effective group size, which measures how many independent voters are needed to obtain the same majority vote accuracy as that realized by non-independent sequential votes. We find that voters’ imitation behavior deteriorates majority vote accuracy, and we quantify this deterioration as a decrease in the effective group size. We argue that this decline in majority vote accuracy is caused by two factors: a decrease in the number of independent voters and an increase in the disparity of voters’ influence on succeeding voters’ decisions.

### Keywords

Condorcet’s jury theorem, Effective group size, Majority vote, Sequential decision-making, Social influence, Wisdom of crowds

### Mathematics Subject Classification

91C99

## 1 Introduction

Human society depends heavily on the performance of group decision making. Decisions on what kind of international treaty to conclude, which presidential candidate to choose, and whether to raise a tax influence our quality of life. Further, group decision making has recently attracted attention not only in the social sciences but also in biology (e.g., [12, 20, 24]). In the biological context, group decision accuracy can affect each group member’s reproduction and survival, as observed in migration decisions, foraging decisions, and decisions on collective activities [10, 11]. Thus, analyzing the conditions that enhance or deteriorate group decision accuracy would lead to a better understanding of both human and non-human societies.

Condorcet’s jury theorem is a classical result on this question. It considers a dichotomous choice and assumes that the probability of making a correct decision is *p* for all voters. When each voter makes a decision independently, the probability that the majority vote is correct is

\[P_{N} = \sum_{k=(N+1)/2}^{N} \binom{N}{k} p^{k} (1-p)^{N-k},\]

where *N* (odd) is the group size. This theorem states that if *p* is greater (less) than one-half, majority vote accuracy \(P_{N}\) is an increasing (decreasing) function of the group size *N*. Extensions of this theorem address the supermajority rule [14], hierarchical voting [6], game-theoretic situations [1, 3], and logically interconnected multiple agendas [15, 25]. Furthermore, researchers have relaxed the assumption of independence, which is easy to dispute once the influence of social interactions is taken into account. Boland et al. [9] investigate how dependence among voters affects majority vote accuracy by modeling the situation where all individuals can refer to the same opinion leader. In addition, a series of studies by Berg [4, 5] and Ladha [21, 22, 23] deals with the aggregation of dependent votes, in which the probability of multiple individuals’ making a correct decision simultaneously differs from the product of the probabilities of each player’s making a correct decision.
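As a minimal sketch, the majority vote accuracy of independent voters can be evaluated directly as a binomial tail probability. The function name `majority_accuracy` is illustrative and not part of the original study.

```python
from math import comb


def majority_accuracy(n: int, p: float) -> float:
    """Probability that more than half of n independent voters are correct.

    n is the (odd) group size and p is the common probability that a single
    voter decides correctly.
    """
    if n < 1 or n % 2 == 0:
        raise ValueError("n must be a positive odd integer so that ties cannot occur")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    # Sum the binomial probabilities of k correct votes for every majority k.
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range((n + 1) // 2, n + 1)
    )


if __name__ == "__main__":
    for size in (1, 3, 11, 101):
        print(size, majority_accuracy(size, 0.6))
```

With a competence above one-half the printed accuracy grows with the group size, and with a competence below one-half it shrinks, which is the statement of the theorem.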

Almost all studies so far have assumed simultaneous decision making or ignored the process that generates dependence among individuals. However, in reality, individuals often make decisions at different times. For example, as Banerjee [2] noted, decisions on what restaurants to choose, what academic research topics to work on, and how many children to have are influenced by the opinions of people who have already made such decisions. Thus, the imitation processes associated with this fact should be considered. Theoretical studies on information cascades have often dealt with non-simultaneous decision-making timings. For example, in their influential study, Bikhchandani et al. [7] considered the sequential decision-making situation where each individual intends to choose a correct option by referring to the decisions of all predecessors. In this framework, they examined the likelihood of correct decisions made by succeeding individuals.

In the present study, to obtain better insight into the accuracy of non-simultaneous collective decision making, we introduce a concept that is novel in this research field, *effective group size*. This concept helps us to quantitatively decompose the deterioration of majority vote accuracy in the process of sequential decision-making into two factors.

This paper proceeds as follows. Section 2 constructs a model. Section 3 evaluates to what extent majority vote accuracy deteriorates through a sequential decision-making process by using multiple criteria. Section 4 provides an intuitive explanation of the deterioration and concludes the paper.

## 2 Model

A group of *N* voters faces a dichotomous choice problem. The correct alternative is determined a priori. Voters make decisions sequentially. Once a voter makes a decision, she never changes it. Throughout this study, we call voters who have not yet made decisions at a given time *undecided voters*. Voters who have already reached decisions are called *decided voters*. We assume naïve voters who do not know the total number of votes for each choice. Nor do these naïve voters know whether another voter has already reached a decision until they directly refer to that voter. To model this social process, our model contains undecided voters as well as decided voters.

In an elementary step of update, a *focal voter* is chosen randomly from the undecided voters. With probability \(1-s\), the focal voter makes a decision independently. In this case, the focal voter makes a correct/wrong decision with probabilities *p* and \(1-p\), respectively, and this elementary step ends. Note that *p* can be interpreted as the voter’s competence or the reliability of the information that the voter receives. With the remaining probability of *s*, the focal voter randomly chooses another voter from the entire group as an *exemplar*. The focal voter imitates the exemplar’s decision if the exemplar has already made a decision; otherwise, nothing happens and the focal voter remains undecided. This ends the elementary step. These elementary steps of update repeat until all individuals reach decisions. We assume that at the initial state no one has made a decision. Thus, if \(s = 1\), the first focal voter cannot refer to anyone. Therefore, for \(s = 1\) we assume that the first focal voter is forced to make a decision independently, and then all subsequent voters imitate another voter’s decision. After all individuals have made decisions, the group decision is determined using the simple majority rule. When \(s = 0\), the situation is identical to the independent voting assumed in Condorcet’s jury theorem. We call the model described above *Model 1*.

Among decided voters, those who have already reached decisions independently are called *independent voters*, and those who have reached decisions by imitating another voter are called *imitators*^{1} (see Fig. 1 for classification of the voters). Let \(X_{t}\) and \(Y_{t}\) be the number of independent voters and that of imitators at time *t*. We also have their frequencies, \(x_{t}\,({=} X_{t} /N)\) and \(y_{t}\,({=}Y_{t}/N)\), respectively. Note that \(1-x_{t}-y_{t}\) is the frequency of undecided voters at time *t*.

In summary, in an elementary step of update in Model 1, the value of \(X_{t}\) (independent voters) increases by one with probability \(1-s\), the value of \(Y_{t}\) (imitators) increases by one with probability \(s(x_{t}+y_{t})\), and no change occurs with the remaining probability.
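The elementary step summarized above can be sketched as a small individual-based simulation. This is an illustrative sketch rather than the code used for the figures: the names `simulate_model1` and `majority_is_correct` are hypothetical, and the exemplar is assumed to be drawn uniformly from the entire group (so that drawing any undecided voter, including the focal voter herself, leaves the state unchanged), which reproduces the imitation probability \(s(x_{t}+y_{t})\).

```python
import random


def simulate_model1(n, s, p, rng):
    """Run one realization of Model 1 and return (decision, origin).

    decision[i] is True if voter i ends up correct. origin[i] is the index of
    the independent voter that voter i's decision traces back to, so that
    origin[i] == i exactly when voter i decided independently.
    """
    decision = [None] * n  # None marks an undecided voter
    origin = [None] * n
    undecided = list(range(n))
    while undecided:
        slot = rng.randrange(len(undecided))
        focal = undecided[slot]
        # For s = 1 the very first focal voter is forced to decide independently.
        forced = s >= 1 and len(undecided) == n
        if forced or rng.random() < 1 - s:
            decision[focal] = rng.random() < p
            origin[focal] = focal
        else:
            # Assumption: exemplar drawn from the whole group of n voters.
            exemplar = rng.randrange(n)
            if decision[exemplar] is None:
                continue  # nothing happens; the focal voter stays undecided
            decision[focal] = decision[exemplar]
            origin[focal] = origin[exemplar]
        # Remove the focal voter from the undecided pool in constant time.
        undecided[slot] = undecided[-1]
        undecided.pop()
    return decision, origin


def majority_is_correct(decision):
    """Simple majority rule over the final decisions (group size odd)."""
    return 2 * sum(decision) > len(decision)


if __name__ == "__main__":
    rng = random.Random(0)
    runs = 2000
    hits = sum(majority_is_correct(simulate_model1(101, 0.5, 0.6, rng)[0])
               for _ in range(runs))
    print("estimated majority vote accuracy:", hits / runs)
```

Averaging `majority_is_correct` over many runs gives a Monte Carlo estimate of the majority vote accuracy of the sequential process, and the `origin` list records which voters decided independently.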

We also consider *Model 2*, which we analyze in Appendix A.

## 3 Results

This section analyzes the outcome of our process, with special attention to the resulting majority vote accuracy. First, we calculate the frequency of independent voters when all voters reach decisions (Sect. 3.1). Second, using the concept of effective group size, we demonstrate that the smaller the frequency of independent voters, the worse the majority vote accuracy becomes; however, this phenomenon does not explain the full range of our results (Sect. 3.2). Third, we evaluate the resulting disparity among independent voters’ influence on succeeding voters, which proves to be the other factor causing the deterioration of majority vote accuracy (Sects. 3.3 and 3.4).

### 3.1 Frequency of independent voters

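We consider a sufficiently large group size *N* and measure time *t* so that one unit of time corresponds to *N* elementary steps. The time evolution of the frequencies \(x_{t}\) and \(y_{t}\) is then described approximately by the following differential equations:

\[\frac{dx_{t}}{dt} = 1-s, \qquad \frac{dy_{t}}{dt} = s(x_{t}+y_{t}).\]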
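The following sketch indicates how the final frequency of independent voters can be obtained from the per-step probabilities summarized at the end of Sect. 2. It assumes that one unit of time corresponds to *N* elementary steps, so that the decided fraction \(x_{t}+y_{t}\) grows at rate \((1-s)+s(x_{t}+y_{t})\) while \(x_{t}\) grows at rate \(1-s\). Starting from no decided voters, the decided fraction then reaches one at \(t^{*}=-\ln(1-s)/s\), which gives \(x_{t^{*}}=-(1-s)\ln(1-s)/s\) for \(0<s<1\). The function names are illustrative, and the closed form is our reconstruction under these assumptions.

```python
import math


def final_time(s):
    """Time t* at which all voters have decided, in units of N elementary steps."""
    if not 0 <= s < 1:
        raise ValueError("the closed form assumes 0 <= s < 1")
    # The limit s -> 0 gives t* = 1 (every step produces an independent voter).
    return 1.0 if s == 0 else -math.log1p(-s) / s


def final_independent_fraction(s):
    """Closed-form frequency of independent voters at the final time."""
    return (1 - s) * final_time(s)


def euler_independent_fraction(s, dt=1e-4):
    """Numerical cross-check: integrate the decided fraction z until z = 1."""
    z = 0.0
    t = 0.0
    while z < 1.0:
        z += ((1 - s) + s * z) * dt  # dz/dt = (1 - s) + s z
        t += dt
    return (1 - s) * t  # x grows at the constant rate 1 - s


if __name__ == "__main__":
    for s in (0.1, 0.5, 0.9):
        print(s, final_independent_fraction(s), euler_independent_fraction(s))
```

The closed form and the forward-Euler integration should agree up to the step size, and the frequency of independent voters falls as the imitation rate rises.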

### 3.2 Effective group size

Our model is an extension of Condorcet’s model relaxing the assumption of independence. To what degree does a decrease in the number of independent voters explain majority vote accuracy? This section provides a criterion to answer that question.

When *N* is sufficiently large, the majority vote accuracy of independent voters, \(P_{N}\), can be approximated by the central limit theorem as

\[P_{N} \approx \Phi\!\left(\frac{(2p-1)\sqrt{N}}{2\sqrt{p(1-p)}}\right), \tag{6}\]

where \(\Phi\) is the cumulative distribution function of the standard normal distribution. Given the group size *N*, we can thus estimate the accuracy of majority vote of that group, \(P_{N}\), for the case of all votes being completely independent. Equation (6) can also be used in the opposite direction. Let \(P^{*}\) denote the majority vote accuracy obtained in our sequential decision-making model. By substituting \(P^{*}\) for \(P_{N}\) in Eq. (6), we can calculate the group size *N* that would lead to the majority vote accuracy \(P^{*}\) if all votes were independent. Hereafter, we write this *N* as \(N_{\textit{eff}}\) and call it the *effective group size*. \(N_{\textit{eff}}\) measures how many independent voters are needed to obtain the given majority vote accuracy that was actually realized by dependent voters’ judgments.
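A minimal sketch of this inversion, assuming the normal approximation without continuity correction, \(P_{N}\approx\Phi\big((p-1/2)\sqrt{N}/\sqrt{p(1-p)}\big)\). Solving for the group size gives \(N_{\textit{eff}}=p(1-p)\,[\Phi^{-1}(P^{*})/(p-1/2)]^{2}\). The function names are illustrative.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def clt_majority_accuracy(n, p):
    """Normal approximation of the majority vote accuracy of n independent voters."""
    return _STD_NORMAL.cdf((p - 0.5) * n**0.5 / (p * (1 - p)) ** 0.5)


def effective_group_size(p_star, p):
    """Group size of independent voters that would reproduce accuracy p_star.

    Assumes 0 < p_star < 1 and p != 1/2. The result is meaningful when p_star
    lies on the same side of one-half as p, because squaring discards the sign.
    """
    if not 0.0 < p_star < 1.0:
        raise ValueError("p_star must lie strictly between 0 and 1")
    if p == 0.5:
        raise ValueError("p = 1/2 gives accuracy 1/2 for every group size")
    z = _STD_NORMAL.inv_cdf(p_star)
    return p * (1 - p) * (z / (p - 0.5)) ** 2


if __name__ == "__main__":
    accuracy = clt_majority_accuracy(25, 0.6)
    print(accuracy, effective_group_size(accuracy, 0.6))
```

In practice, \(P^{*}\) would come from the sequential model (for example, a simulation estimate), and the returned value is generally not an integer.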

The larger the imitation rate *s* is, the smaller the effective group size is. Further, the gap between the number of independent voters and the effective group size shown in Fig. 3 implies that the deterioration of majority vote accuracy is not explained solely by the decrease in the number of independent voters. In Sect. 3.3, we consider another factor causing the deterioration of majority vote accuracy.

### 3.3 Relationship between the times at which independent voters make decisions and their influence

The result of Sect. 3.2 indicates that the decrease in the number of independent voters alone cannot explain the full extent of the deterioration of majority vote accuracy. To identify another factor causing this deterioration, we pay attention to the path of imitations. After all voters reach decisions, we arrange them in the order in which they made decisions and label them as the 1st to \(N\)th voters. Imagine the case where the \(k\)th voter imitated the \(j\)th voter’s decision \((j < k)\). The \(j\)th voter might have also imitated the \(i\)th voter’s decision \((i<j)\). In this case, we should attribute the \(k\)th voter’s decision being the same as the \(i\)th voter’s to the fact that the \(j\)th voter imitated the \(i\)th voter. This path of imitation implies that the influence of the \(i\)th voter reaches not only the \(j\)th voter but also the \(k\)th voter. Starting from any given focal voter, we go upstream along this path of imitation until we reach an independent voter, and call her the *origin* of the focal voter. Note that the origin of an independent voter is herself. Conversely, for a given independent voter, those who have her as their origin are called this voter’s *followers*. By definition, an independent voter is a follower of herself. We call the number of followers of an independent voter her *influence*.
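The definitions of origin, followers, and influence can be illustrated with a short sketch. The input encoding is an assumption made for this example: voters are indexed from 0 in the order in which they decided, and `imitated[k]` holds the index of the voter whom voter `k` imitated, or `None` if voter `k` decided independently.

```python
from collections import Counter


def origins_and_influence(imitated):
    """Trace every voter back to her origin and count each origin's followers.

    Returns (origin, influence), where origin[k] is the independent voter at
    the upstream end of voter k's imitation path and influence maps each
    independent voter to her number of followers (herself included).
    """
    origin = []
    for k, j in enumerate(imitated):
        if j is None:
            origin.append(k)  # an independent voter is her own origin
        else:
            if not 0 <= j < k:
                raise ValueError("a voter can only imitate an earlier voter")
            origin.append(origin[j])  # inherit the exemplar's origin
    return origin, dict(Counter(origin))


if __name__ == "__main__":
    # Voters 0 and 3 are independent; voter 2 imitates voter 1, who imitates voter 0.
    origin, influence = origins_and_influence([None, 0, 1, None, 3, 0])
    print(origin)
    print(influence)
```

Because each voter imitates an earlier voter, a single pass in decision order suffices: the exemplar’s origin is already known when the imitator is processed.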

Consider the *i*th earliest of the \(X_{t^{*}}\) independent voters present at the final time \(t^{*}\). We call her the *q-quantile independent voter*, where \(q=i/X_{t^{*}}\,(0<q\le 1)\). The growth of the frequency of the followers of this *q*-quantile independent voter is described approximately by the following differential equation:

\[\frac{dz_{q}(t)}{dt} = s\,z_{q}(t), \qquad z_{q}(t_{q}) = \frac{1}{N}, \tag{7}\]

where \(z_{q}(t)\) is the frequency of followers of the *q*-quantile independent voter, and \(t_{q}\) is the time at which the *q*-quantile independent voter reaches a decision. Equation (7) is not valid before time \(t_{q}\). This equation states that the instantaneous rate of increase of the frequency of followers is *s* times the frequency of followers at that time. This simplicity comes from the fact that indirect followers are not distinguished from direct followers in considering the diffusion of an origin’s decision. The initial condition in Eq. (7) comes from the fact that the origin herself is also counted as her follower. Solving this differential equation, we have

\[z_{q}(t) = \frac{1}{N}\,e^{s(t-t_{q})} \quad (t \ge t_{q}). \tag{8}\]

Setting \(t=t^{*}\), we obtain the frequency of followers of the *q*-quantile independent voter at the final time as follows:

\[z_{q}(t^{*}) = \frac{1}{N}\,e^{s(t^{*}-t_{q})}. \tag{9}\]

Because \(t_{q}\) increases with *q*, this is a decreasing function of *q*. Therefore, sorting voters by the order in which they make decisions is equivalent to sorting their influence from the largest to the smallest. Figure 4 presents the result of individual-based simulations and the analytical approximation obtained by Eq. (9) in one graph. As the figure illustrates, our approximation matches well the expected frequency of followers.
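The exponential growth of followers described above can be made explicit in *q* under two additional assumptions, which are our reconstruction rather than statements taken from the text: independent voters appear at a constant rate, so that \(t_{q}=q\,t^{*}\), and the final time is \(t^{*}=-\ln(1-s)/s\). Growth at rate *s* from a single follower at time \(t_{q}\) then yields an expected follower count of \(e^{s t^{*}(1-q)}=(1-s)^{q-1}\) at the final time. The sketch below evaluates this expression and checks that the followers of all independent voters add up to the whole group; the function names are illustrative.

```python
import math


def expected_followers(q, s):
    """Expected number of followers of the q-quantile independent voter.

    Assumes 0 < q <= 1 and 0 < s < 1; q close to 0 is the earliest voter.
    """
    return (1 - s) ** (q - 1)


def total_follower_frequency(s, bins=10000):
    """Fraction of the group covered by all independent voters' followers.

    Multiplies the assumed final frequency of independent voters by the
    mean follower count over q (midpoint rule). The result should be 1.
    """
    x_final = -(1 - s) * math.log1p(-s) / s
    mean_count = sum(
        expected_followers((k + 0.5) / bins, s) for k in range(bins)
    ) / bins
    return x_final * mean_count


if __name__ == "__main__":
    for q in (0.01, 0.5, 1.0):
        print(q, expected_followers(q, 0.5))
    print(total_follower_frequency(0.5))
```

The earliest independent voters have the most followers, and the latest one has only herself, consistent with influence decreasing in *q*.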

### 3.4 Disparity of influence among independent voters

We examine how the Gini coefficient *G* of influence among independent voters changes as the imitation rate *s* increases in order to explain the relationship between imitation intensity and inequality of influence (see Appendix B for our reason for choosing the Gini coefficient for this purpose). Although the distribution of influence is actually discrete in our model, for simplicity, we pursue this problem by regarding it as a continuous distribution. The Gini coefficient for continuous distributions is defined as follows:

\[G = 1 - 2\int_{0}^{1} L(q')\,dq', \tag{10}\]

where \(q' = 1-q\); that is, the *q*-quantile earliest independent voter is the \(q'\)-quantile latest independent voter. Moreover, \(L(q')\) is the Lorenz function, which represents the proportion of the followers of 0-quantile to \(q'\)-quantile independent voters [26] (see also [16]). If all voters have the same number of followers, the graph of the Lorenz function (called the Lorenz curve) lies on the diagonal. Thus, Eq. (10) states that the Gini coefficient is obtained by doubling the area between the diagonal line and the Lorenz curve. *G* ranges from 0 to 1. \(G=0\) means perfect equality (i.e., all voters have the same number of followers), and \(G=1\) means perfect inequality (i.e., a single voter is followed by all the remaining voters).
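For a finite list of influences, such as follower counts obtained from a simulation, the same definition can be applied with a piecewise-linear Lorenz curve. The sketch below is one common discrete convention and is not taken from the original analysis; the function name `gini` is illustrative. With this convention, the maximum for *n* voters is \((n-1)/n\) rather than 1.

```python
def gini(influences):
    """Gini coefficient as one minus twice the area under the Lorenz curve."""
    values = sorted(influences)  # least influential voters first
    n = len(values)
    total = sum(values)
    if n == 0 or total <= 0:
        raise ValueError("need at least one voter and a positive total influence")
    area = 0.0
    cumulative = 0.0
    for value in values:
        previous = cumulative
        cumulative += value / total
        # Trapezoid between consecutive points of the Lorenz curve.
        area += (previous + cumulative) / (2 * n)
    return 1 - 2 * area


if __name__ == "__main__":
    print(gini([2, 2, 2]))  # equal influence
    print(gini([0, 0, 3]))  # one voter holds all the influence
```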

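Using the distribution of influence obtained in Sect. 3.3, we obtain *G* as a function of the imitation rate *s*, which is an increasing function of *s* (see Appendix C for the proof).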
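As a hedged reconstruction, suppose the influence of the \(q'\)-quantile latest independent voter is proportional to \((1-s)^{-q'}\), which follows from exponential follower growth if independent voters appear at a constant rate (an assumption here). The Lorenz function is then \(L(q')=[(1-s)^{-q'}-1]/[(1-s)^{-1}-1]\), and doubling the area between the diagonal and this curve gives \(G(s)=1+2(1-s)/s+2/\ln(1-s)\). The sketch below compares this closed form with a direct numerical integration; both function names are illustrative.

```python
import math


def gini_closed_form(s):
    """Reconstructed closed form of the Gini coefficient for 0 < s < 1."""
    return 1 + 2 * (1 - s) / s + 2 / math.log1p(-s)


def gini_numeric(s, bins=20000):
    """Gini coefficient from a midpoint-rule integral of the Lorenz function."""
    a = -math.log1p(-s)  # influence of the q'-quantile voter is exp(a * q')

    def lorenz(u):
        return math.expm1(a * u) / math.expm1(a)

    area = sum(lorenz((k + 0.5) / bins) for k in range(bins)) / bins
    return 1 - 2 * area


if __name__ == "__main__":
    for s in (0.1, 0.5, 0.9):
        print(s, gini_closed_form(s), gini_numeric(s))
```

Under these assumptions the coefficient rises with the imitation rate, tending to 0 as \(s \to 0\) and to 1 as \(s \to 1\).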

## 4 Discussion and concluding remarks

This study models the sequential decision-making situation wherein each voter makes a decision at a different time referring to a predecessor’s decision, investigates factors that worsen the majority vote accuracy, and evaluates to what degree those factors are responsible. As a result, we find that the deterioration of majority vote accuracy is attributable to two mechanisms. First, majority vote accuracy deteriorates as the number of imitators increases, as Fig. 3 illustrates. To understand this process intuitively, let us consider an extreme situation, where a single independent voter is imitated by all the remaining voters. According to Condorcet’s jury theorem, majority vote accuracy increases as the group size increases when *p* is greater than one-half. Nevertheless, in this case, the collective decision is the same as a single voter’s decision, and therefore the group loses the advantage of size. This outcome is compatible with [9], who assumed simultaneous correlated voting, although our results differ quantitatively from theirs, as mentioned in Sect. 3.1.

Second, the disparity of influence among independent voters worsens the majority vote accuracy, as Figs. 4 and 5 illustrate. Let us consider the following situation to explain this mechanism. There are three voters making decisions independently. Let us assume that the weights for the three votes are 100, 2, and 1, respectively. The difference among the weights for their votes corresponds to the inequality of influences in our model. Although there are actually three voters, this situation is the same as the case wherein only the first voter casts a vote. That is, because we adopt the simple majority rule, the last two voters cannot affect the collective decision, and the first voter’s single vote represents a collective decision. Clearly, when *p* is greater than one-half, the probability of a single voter making an incorrect decision is higher than that of the unweighted majority vote by three voters. Again, this group does not take advantage of its size. This outcome is related to the study by [27], which addressed the optimal weighted majority rule. They proved that the weighted majority voting system in which the weight for the vote by individual *i* with competence \(p_{i}\) is \(\log[p_{i}/(1-p_{i})]\) maximizes the majority vote accuracy under the assumption that all competences are larger than one-half. Therefore, under the assumption of equally skilled individuals, as in the present study, the majority rule of distributing weights equally to everyone is the optimal weighted majority rule. The inequality of influences represents the deviation from the optimum. In summary, as the imitation rate *s* increases, the first mechanism operates and the number of independent voters decreases. Further, the inequality of influences among independent voters grows with increasing *s*. We have evaluated to what extent the combination of these two mechanisms worsens the majority vote accuracy by using the novel criterion called effective group size.
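The three-voter example with weights 100, 2, and 1 can be checked by enumerating all combinations of correct and wrong votes. The sketch below is illustrative: the function names are hypothetical, and ties in total weight are counted as incorrect group decisions, which is an assumption that does not matter for the weights used here.

```python
from itertools import product
from math import log


def weighted_majority_accuracy(weights, competences):
    """Probability that the correct side holds more than half of the total weight."""
    total = sum(weights)
    accuracy = 0.0
    for outcome in product((True, False), repeat=len(weights)):
        probability = 1.0
        correct_weight = 0.0
        for is_correct, weight, p in zip(outcome, weights, competences):
            probability *= p if is_correct else 1 - p
            if is_correct:
                correct_weight += weight
        if correct_weight > total / 2:
            accuracy += probability
    return accuracy


def log_odds_weights(competences):
    """Weights log[p_i / (1 - p_i)], defined for competences strictly in (0, 1)."""
    return [log(p / (1 - p)) for p in competences]


if __name__ == "__main__":
    competences = [0.6, 0.6, 0.6]
    print(weighted_majority_accuracy([100, 2, 1], competences))
    print(weighted_majority_accuracy([1, 1, 1], competences))
    print(weighted_majority_accuracy(log_odds_weights(competences), competences))
```

With equal competences the log-odds weights are all equal, so the third line reproduces the unweighted majority, whereas the unequal weights reduce the group to its first voter.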

To highlight the second mechanism, Appendix D compares the majority vote accuracies of two models, which generate the same frequency of independent voters when all voters reach decisions. One is our *Model 3*, which is a sequential decision-making model. The other is the Model 1 in [9], which describes a simultaneous decision-making situation. We find that the two models generate different majority vote accuracies. This implies that in considering the group decision accuracy, not only the frequency of independent voters but also the process through which the frequency is determined or the circumstance in which voters are involved should be taken into account.

Several theoretical studies have previously investigated weighted voting (e.g., [27]). However, in these studies, weights are determined exogenously, and the researchers do not discuss how the weights are determined through social interactions. In contrast, the current study investigates how the distribution of influences, which is qualitatively equivalent to the distribution of weights, generated endogenously in the process of sequential decision-making affects majority vote accuracy. Consequently, we demonstrate that the distribution of influences among independent voters follows an exponential distribution (Eq. (9)). In Appendix E, we show that the conditional distribution of influences among all voters given the voter is an independent voter follows a power-law relationship (see Eq. (32)).

Our model might be too simple to explain the micro-foundation that worsens group decision accuracy. For example, in the present model, voters’ competence is assumed to be homogeneous. However, in the real world, individuals with different competences make collective decisions. The assumption of heterogeneous competence raises a new research question (cf. [28]). Studying how the correlation between a voter’s competence and imitation rate affects majority vote accuracy would be interesting. Moreover, we treat naïve imitations, whereas in the sequential decision-making model by [7] each individual makes a decision by calculating the probability of each option being correct based on the observation of all predecessors’ actions. Future studies should investigate how different micro-level decision-making processes affect group decision accuracy by incorporating empirical studies. Last, our idea of tracing the path of imitation and then deriving the disparity of influence requires only the imitation rate *s*, and does not require the distribution of competences. Therefore, our idea is expected to be a useful tool for future work.

Our notion of imitators is the same as that of *copycat voters* in the model of [17]. Moreover, our assumption that independent voters and imitators appear randomly is similar to that of [17]. However, as defined in the first paragraph of Sect. 2, our Model 1 allows undecided voters to exist, whereas [17] does not. Owing to the existence of undecided voters, a would-be imitator who samples an undecided voter gives up the imitation and may become an independent voter the next time she is chosen. This assumption leads to a non-trivial mathematical result for the frequency of independent voters at the final time, which is expressed as Eq. (5) in Sect. 3.1. Furthermore, although [18] calculates the majority vote accuracy in sequential decision-making, that paper assumes that a *herder* votes for the candidate holding the majority of votes at the time of the herder’s decision-making.

## Acknowledgments

The authors thank three anonymous reviewers for their helpful comments. This work was supported by the Grant-in-Aid for JSPS Fellows Grant Number 13J05358 (TS) and JSPS KAKENHI Grant Number 25118006 (HO).

## Copyright information

**Open Access** This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.