1 Introduction

Human society heavily depends on the performance of group decision making. Decisions on what kind of international treaty to conclude, which presidential candidate to choose, and whether to raise a tax or not influence our quality of life. Further, group decision making has recently attracted attention in not only social sciences but also biology (e.g., [12, 20, 24]). This has occurred because in the biological context, group decision accuracy can affect each group member’s reproduction and survival, as observed in migration decisions, foraging decisions, and decisions on collective activities [10, 11]. Thus, analyzing the conditions that enhance or deteriorate group decision accuracy would lead to a better understanding of both human and non-human societies.

In theoretical studies, Condorcet’s jury theorem has been a highly stylized benchmark for examining the accuracy of group decision making. This theorem considers the situation where a group must choose the correct option from two alternatives. The probability of choosing the correct option is assumed to be p for all voters. When each voter makes a decision independently, the probability that the majority vote is correct is

$$\begin{aligned} P_N \equiv \sum _{h=(N+1)/2}^N{\left( {\begin{array}{l} N \\ h \\ \end{array}}\right) p^{h}(1-p)^{N-h}}, \end{aligned}$$
(1)

where N (odd) is the group size. This theorem states that if p is greater (less) than one-half, majority vote accuracy \(P_{N}\) is an increasing (decreasing) function of the group size N. Several extensions of this theorem exist: supermajority rules [14], hierarchical voting [6], game-theoretic situations [1, 3], and logically interconnected multiple agendas [15, 25]. Furthermore, researchers have relaxed the assumption of independence, which is easy to dispute once the influence of social interactions is taken into account. Boland et al. [9] investigated how dependence among voters affects majority vote accuracy by modeling the situation where all individuals can refer to the same opinion leader. In addition, a series of studies by Berg [4, 5] and Ladha [21–23] deals with the aggregation of dependent votes, in which the probability that multiple individuals simultaneously make correct decisions differs from the product of the probabilities that each individual makes a correct decision.
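As an illustration, Eq. (1) can be evaluated directly for small groups. The following is a minimal sketch; the function name `majority_accuracy` is ours and does not appear in the original study.

```python
from math import comb


def majority_accuracy(n: int, p: float) -> float:
    """Eq. (1): probability that a simple majority of n independent voters,
    each correct with probability p, reaches the correct decision."""
    if n < 1 or n % 2 == 0:
        raise ValueError("n must be a positive odd number (no tie-breaking rule)")
    # Sum the binomial probabilities of h = (n+1)/2, ..., n correct votes.
    return sum(
        comb(n, h) * p**h * (1 - p) ** (n - h)
        for h in range((n + 1) // 2, n + 1)
    )
```

For \(p=0.7\), the value for \(N=3\) is \(3\cdot 0.7^{2}\cdot 0.3+0.7^{3}=0.784\), which already exceeds the single-voter accuracy of 0.7, in line with the theorem.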

Almost all studies so far have assumed simultaneous decision making or ignored the process that generates dependence among individuals. However, in reality, individuals often make decisions at different times. For example, as Banerjee [2] noted, decisions on what restaurants to choose, what academic research topics to work on, and how many children to have are influenced by the opinions of people who have already made such decisions. Thus, the imitation processes associated with this fact should be considered. Theoretical studies on information cascades have often dealt with non-simultaneous decision making. For example, in their influential study, Bikhchandani et al. [7] considered the sequential decision-making situation where each individual intends to choose a correct option by referring to the decisions of all predecessors. In this framework, they examined the likelihood of correct decisions made by succeeding individuals.

In the present study, to obtain better insight into the accuracy of non-simultaneous collective decision making, we introduce a novel concept in this research field: effective group size. This concept helps us to quantitatively decompose the deterioration of majority vote accuracy in the process of sequential decision making into two factors.

This paper proceeds as follows. Section 2 constructs a model. Section 3 evaluates to what extent majority vote accuracy deteriorates through a sequential decision-making process by using multiple criteria. Section 4 provides an intuitive explanation of the deterioration and concludes the paper.

2 Model

A group of N voters faces a dichotomous choice problem. The correct alternative is determined a priori. Voters sequentially make decisions. Once a voter makes a decision, she never changes it. Throughout this study, we call voters who have not made decisions yet at a given time undecided voters. Voters who have already reached decisions are called decided voters. We here assume naïve voters who do not know the number of total votes for each choice. Those naïve voters do not know whether others have already reached decisions either, before they directly refer to them. To model this social process, our model contains undecided voters as well as decided voters.

In an elementary step of update, a focal voter is chosen randomly from the undecided voters. With probability \(1-s\), the focal voter makes a decision independently. In this case, the focal voter makes a correct/wrong decision with probabilities p and \(1-p\), respectively, and this elementary step ends. Note that p can be interpreted as the voter’s competence or the reliability of the information that the voter receives. With the remaining probability of s, the focal voter randomly chooses another voter from the entire group as an exemplar. The focal voter imitates the exemplar’s decision if the exemplar has already made a decision, and nothing happens otherwise. This ends the elementary step. These elementary steps of update repeat until all individuals reach decisions. We assume that at the initial state no one has made a decision. Thus, if \(s = 1\), the first focal voter cannot refer to anyone. Therefore, for \(s = 1\) we assume that the first focal voter is forced to make a decision independently, and then all subsequent voters imitate another voter’s decision. After all individuals have made decisions, the group decision is determined using the simple majority rule. Therefore, when \(s = 0\), the situation is identical to the independent voting assumed in Condorcet’s jury theorem. We call the model described above Model 1.

Among decided voters, those who have already reached decisions independently are called independent voters, and those who have reached decisions by imitating another voter are called imitators (see Fig. 1 for classification of the voters). Let \(X_{t}\) and \(Y_{t}\) be the number of independent voters and that of imitators at time t. We also have their frequencies, \(x_{t}\,({=} X_{t} /N)\) and \(y_{t}\,({=}Y_{t}/N)\), respectively. Note that \(1-x_{t}-y_{t}\) is the frequency of undecided voters at time t.

In summary, in an elementary step of update in Model 1, the value of \(X_{t}\) (independent voters) increases by one with probability \(1-s\), the value of \(Y_{t}\) (imitators) increases by one with probability \(s(x_{t}+y_{t})\), and no change occurs with the remaining probability.
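The elementary steps of Model 1 can be sketched as a short individual-based simulation. This is only an illustrative reading of the verbal description above, not the code used for the figures: the function name `simulate_model1` is hypothetical, and we assume that the exemplar is drawn uniformly from the other \(N-1\) voters.

```python
import random


def simulate_model1(n, s, p, rng):
    """Run Model 1 once.

    Returns (number of independent voters, True if the majority is correct).
    """
    if n < 3 or n % 2 == 0:
        raise ValueError("n must be an odd number >= 3")
    decision = [None] * n        # None = undecided, True = correct, False = wrong
    undecided = list(range(n))
    n_independent = 0
    while undecided:
        pos = rng.randrange(len(undecided))
        focal = undecided[pos]
        # Special case from the text: for s = 1 the very first focal voter
        # is forced to decide independently.
        forced = s == 1 and len(undecided) == n
        if forced or rng.random() >= s:
            decision[focal] = rng.random() < p   # independent decision
            n_independent += 1
        else:
            # Assumption: exemplar drawn uniformly from the other n - 1 voters.
            exemplar = rng.randrange(n - 1)
            if exemplar >= focal:
                exemplar += 1
            if decision[exemplar] is None:
                continue                          # nothing happens in this step
            decision[focal] = decision[exemplar]  # imitation
        # The focal voter has decided: remove her from the undecided pool.
        undecided[pos] = undecided[-1]
        undecided.pop()
    return n_independent, 2 * sum(decision) > n
```

Averaging the first return value over many runs, for example with `rng = random.Random(0)`, gives an estimate of \(X_{t^{*}}\), and averaging the second gives an estimate of the majority vote accuracy.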

If we focus only on the final frequencies of independent voters and imitators, Model 1 is qualitatively the same as Model 2, which we analyze in Appendix A.

Fig. 1

A flow diagram of Model 1. The voters expressed in black on white backgrounds are undecided voters. The voters expressed in white on black backgrounds are decided voters

3 Results

This section analyzes the outcome of our process, with special attention to the resulting majority vote accuracy. First, we calculate the frequency of independent voters when all voters reach decisions (Sect. 3.1). Second, using the concept of effective group size, we demonstrate that the smaller the frequency of independent voters, the worse the majority vote accuracy becomes; however, this phenomenon does not explain the full range of our results (Sect. 3.2). Third, we evaluate the resulting disparity among independent voters’ influence on succeeding voters, which proves to be the other factor causing the deterioration of majority vote accuracy (Sects. 3.3 and 3.4).

3.1 Frequency of independent voters

Although our model is a stochastic process, we approximate its dynamics with differential equations for tractability. Note that this approximation is reasonable for a sufficiently large N. The time evolution of the frequencies \(x_{t}\) and \(y_{t}\) is described approximately by the following differential equations.

$$\begin{aligned} \left\{ {\begin{array}{l} \frac{{\mathrm{d}}x_t }{\mathrm{d}t}=1-s \\ \frac{{\mathrm{d}}y_t }{\mathrm{d}t}=(x_t +y_t )s \\ x_0 =y_0 =0,\quad x_{t^{*}} +y_{t^{*}} =1, \end{array}}\right. \end{aligned}$$
(2)

where \(t^{*}\) represents the final time when all individuals have made decisions.

Let us derive the final time \(t^{*}\). With \(z_{t}\equiv x_{t}+y_{t}\), we have

$$\begin{aligned} \frac{\hbox {d}z_t }{\hbox {d}t}=(1-s)+sz_t. \end{aligned}$$
(3)

Note that \(z_{0} = 0\) and \(z_{t^{*}} = 1\). By solving Eq. (3) within the range of \(0\le t\le t^{*}\), we obtain

$$\begin{aligned}&\left[ {\frac{1}{s}\log \{(1-s)+sz_t \}} \right] _{z_0 }^{z_{t^{*}} } =[t]_0^{t^{*}}\nonumber \\&\quad \Leftrightarrow t^{*}=-\frac{1}{s}\log (1-s). \end{aligned}$$
(4)

Because \(x_{t}=(1-s)t\) from the first equation in Eq. (2), we have the frequency of independent voters at the final time \(t^{*}\), as

$$\begin{aligned} x_{t^{*}} =-(1-s)\frac{\log (1-s)}{s}. \end{aligned}$$
(5)

Figure 2 compares the frequency of independent voters in an individual-based simulation with that obtained from Eq. (5). This figure indicates that our system of differential equations provides an excellent approximation of the actual stochastic process. Equation (5) and our simulation results show that the value of \(x_{t^{*}}\) drops sharply as s approaches 1. This result is in sharp contrast to the model of Boland et al. [9], where some voters make decisions independently and others simultaneously imitate the decision of the same entity. In their model, the resulting frequency of independent decision makers is a linear function of the imitation rate s, namely \(1-s\).
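The closed forms in Eqs. (4) and (5) are easy to check numerically. The sketch below evaluates them and, as a sanity check, integrates Eq. (3) with a simple Euler scheme; the function names and the step size `dt` are illustrative choices of ours.

```python
import math


def final_time(s):
    """Eq. (4): t* = -log(1 - s) / s, valid for 0 < s < 1."""
    return -math.log(1.0 - s) / s


def independent_fraction(s):
    """Eq. (5): x_{t*} = (1 - s) * t*."""
    return (1.0 - s) * final_time(s)


def integrate_eq3(s, dt=1e-4):
    """Euler integration of Eq. (3), dz/dt = (1 - s) + s z, from z = 0
    until z reaches 1. Returns the (approximate) time at which this happens."""
    z, t = 0.0, 0.0
    while z < 1.0:
        z += ((1.0 - s) + s * z) * dt
        t += dt
    return t
```

For \(s=0.7\), `independent_fraction(0.7)` is approximately 0.516, whereas the linear relation \(1-s\) would give 0.3.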

Fig. 2

Frequency of independent voters when all have made their decisions. The solid line indicates the analytical result shown in Eq. (5), and circles are obtained by individual-based simulations. We implemented \(10^{4}\) simulation runs for each value of s. Group size is \(N=101\)

3.2 Effective group size

Our model is an extension of Condorcet’s model relaxing the assumption of independence. To what degree does a decrease in the number of independent voters explain majority vote accuracy? This section provides a criterion to answer that question.

As stated in Sect. 1, under the assumption of independence, the probability that the majority vote reaches the correct decision is expressed by Eq. (1). When N is sufficiently large, it can be approximated by the central limit theorem as

$$\begin{aligned} P_N\approx & {} \frac{1}{\sqrt{2\pi }}\int _{-\frac{p-0.5}{\sqrt{p(1-p)/N}}}^\infty {\exp \left[ -\frac{x^{2}}{2}\right] \mathrm{d}x} \nonumber \\= & {} \frac{1}{2}\left[ {1+\hbox {erf}\left[ \frac{p-0.5}{\sqrt{2p(1-p)/N}}\right] } \right] , \end{aligned}$$
(6)

where erf is the error function, \(\hbox {erf}(x)\equiv \frac{2}{\sqrt{\pi }}\int _0^x {\exp (-z^{2})\hbox {d}z} \). Equation (6) means that for a given group size N we can estimate the majority vote accuracy of that group, \(P_{N}\), for the case where all votes are completely independent. Equation (6) can therefore also be used in the opposite direction. Let \(P^{*}\) denote the majority vote accuracy obtained in our sequential decision-making model. By substituting \(P^{*}\) for \(P_{N}\) in Eq. (6), we can calculate the group size N that would lead to the majority vote accuracy \(P^{*}\) if all votes were independent. Hereafter, we write this N as \(N_{\textit{eff}}\) and call it the effective group size. \(N_{\textit{eff}}\) measures how many independent voters are needed to obtain the given majority vote accuracy that was actually realized by dependent voters’ judgments.
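Inverting Eq. (6) can be sketched as follows. Writing \(z=\Phi ^{-1}(P^{*})\) for the standard normal quantile, Eq. (6) gives \(N_{\textit{eff}}=z^{2}p(1-p)/(p-0.5)^{2}\). The function names below are ours; the sketch assumes \(p\ne 0.5\), \(0<P^{*}<1\), and that \(P^{*}\) and p lie on the same side of one-half.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def accuracy_clt(n, p):
    """Eq. (6): normal approximation of the majority vote accuracy of
    n independent voters with competence p."""
    z = (p - 0.5) / (p * (1.0 - p) / n) ** 0.5
    return _STD_NORMAL.cdf(z)


def effective_group_size(p_star, p):
    """Solve Eq. (6) for N given the observed accuracy p_star.

    Meaningful only if p != 0.5, 0 < p_star < 1, and p_star and p are on
    the same side of 0.5 (the sign of z is lost when squaring).
    """
    z = _STD_NORMAL.inv_cdf(p_star)
    return z * z * p * (1.0 - p) / (p - 0.5) ** 2
```

By construction, `effective_group_size(accuracy_clt(n, p), p)` recovers `n`, and the result is generally not an integer.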

Figure 3 depicts the relationship between the number of all voters, the number of independent voters, and the effective group size. For example, when \(p=0.7\) and \(s=0.7\), even if 47 voters actually participate, approximately 24 individuals out of 47 make a decision independently, and the majority vote accuracy of our model corresponds to that of Condorcet’s model with approximately 13 independent voters. By comparing the two panels of Fig. 3, we can see that the greater the imitation rate s, the smaller the effective group size. Further, the gap between the number of independent voters and the effective group size shown in Fig. 3 implies that the deterioration of majority vote accuracy is not explained solely by the decrease in the number of independent voters. In Sect. 3.3, we consider another factor causing the deterioration of majority vote accuracy.

Fig. 3

Relationship between the actual group size, the number of independent voters, and effective group size. The horizontal axis indicates the actual group size N. To avoid defining a tie-breaking rule in a majority vote, we treat only the cases where N is an odd number. The vertical axis represents the analytically-predicted number of independent voters at the end of sequential decision making, \(X_{t^{*}}\) (solid line, from Eq. (5)), and effective group size \(N_{\textit{eff}}\) (black dots, from Eq. (6)). For reference, a diagonal is shown in the dashed line. The parameters are \(p=0.7\) and \(s=0.3\) (left), and \(p=0.7\) and \(s=0.7\) (right). To obtain the majority vote accuracy, we implemented \(10^{4}\) simulation runs for each pair of N and s

3.3 Relationship between the times at which independent voters make decisions and their influence

The result of Sect. 3.2 indicates that the decrease in the number of independent voters alone cannot explain the full extent of the deterioration of majority vote accuracy. To identify another factor causing this deterioration, we pay attention to the path of imitations. After all voters reach decisions, we align them in the order in which they made decisions, and name them the 1st to \(N\hbox {th}\) voters. Imagine the case where the \(k\hbox {th}\) voter imitated the \(j\hbox {th}\) voter’s decision \((j < k)\). The \(j\hbox {th}\) voter might have also imitated the \(i\hbox {th}\) voter’s decision \((i<j)\). In this case, we should attribute the \(k\hbox {th}\) voter’s decision being the same as the \(i\hbox {th}\) voter’s to the fact that the \(j\hbox {th}\) voter imitated the \(i\hbox {th}\) voter. This path of imitation implies that the influence of the \(i\hbox {th}\) voter reaches not only the \(j\hbox {th}\) voter but also the \(k\hbox {th}\) voter. Starting from any given focal voter, we go upstream along this path of imitation until we reach an independent voter, and call her the origin of the focal voter. Note that the origin of an independent voter is herself. Conversely, for a given independent voter, those who have her as their origin are called this voter’s followers. By definition, an independent voter is a follower of herself. We call the number of followers of an independent voter her influence.
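The definitions of origin and influence translate into a short tracing procedure. In the sketch below, the input format (a list `imitated` in decision order, with `None` marking an independent voter) is a hypothetical encoding chosen for illustration.

```python
from collections import Counter


def origins(imitated):
    """Return the origin of every voter.

    imitated[k] is the index of the voter whom voter k imitated, or None
    if voter k decided independently. Voters are indexed in the order of
    their decisions, so imitated[k] < k whenever it is not None.
    """
    origin = []
    for k, j in enumerate(imitated):
        # An independent voter is her own origin; an imitator inherits the
        # origin of her exemplar, which is already known because j < k.
        origin.append(k if j is None else origin[j])
    return origin


def influence(imitated):
    """Number of followers of each independent voter (herself included)."""
    return Counter(origins(imitated))
```

For example, with `imitated = [None, 0, 1, None, 3]`, voters 0 and 3 are independent, the third voter reaches voter 0 through voter 1, and the influences are 3 and 2, respectively.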

We now show that the magnitude of influence of each independent voter is characterized by her order in decision making. Recall that \(X_{t^{*}}\) is the number of independent voters at the final time \(t^{*}\). Hereafter, we call the \(i\hbox {th}\) earliest independent voter out of \(X_{t^{*}}\) independent voters the q-quantile independent voter, where \(q=i/X_{t^{*}}\,(0<q\le 1)\). The growth of the frequency of the followers of this q-quantile independent voter is described approximately by the following differential equation:

$$\begin{aligned} \frac{\hbox {d}f_q }{\hbox {d}t}=sf_q,\quad f_q (t_q )=1/N, \end{aligned}$$
(7)

where \(f_{q}\) is the frequency of the followers of the q-quantile independent voter, and \(t_{q}\) is the time at which the q-quantile independent voter reaches a decision. Equation (7) is not valid before time \(t_{q}\). This equation states that the instantaneous rate of increase of the frequency of followers is s times the frequency of followers at that time. This simplicity comes from the fact that indirect followers are not distinguished from direct followers in considering the diffusion of an origin’s decision. The initial condition in Eq. (7) comes from the fact that the origin herself is also counted as her follower. Solving this differential equation, we have

$$\begin{aligned} f_q (t)=\left\{ \begin{array}{ll} \frac{1}{N}\exp [s(t-t_q)]&{}\quad (t\ge t_q) \\ 0&{}\quad (t<t_q). \end{array}\right. \end{aligned}$$
(8)

Using this result, we can calculate the frequency of followers of the q-quantile independent voter at the final time as follows:

$$\begin{aligned} f_q ({t^{*}})= & {} \frac{1}{N}\exp [s(t^{*}-t_q)] \nonumber \\= & {} \frac{1}{N}\exp [s(1-q)t^{*}]\nonumber \\= & {} \frac{1}{N}(1-s)^{q-1}. \end{aligned}$$
(9)

Here we have used \(t_{q}=qt^*\) (easily shown from \(x_{t}=(1-s)t\)) in the second line, and used Eq. (4) in the third line. Obviously, Eq. (9) is a monotonically decreasing function of q. Therefore, sorting voters by the order in which they make decisions is equivalent to sorting their influence from the largest to the smallest. Figure 4 presents the result of individual-based simulations and the analytical approximation obtained by Eq. (9) in one graph. As the figure illustrates, our approximation matches well the expected frequency of followers.
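A simple consistency check of Eq. (9) is that the followers of all independent voters together should make up the whole group, because every voter has exactly one origin. The sketch below evaluates Eq. (9) and approximates the total follower frequency, \(N x_{t^{*}}\int _0^1 f_q(t^{*})\,\mathrm{d}q\), with a midpoint rule; the function names and the number of grid points `m` are our own choices.

```python
import math


def follower_frequency(q, s, n):
    """Eq. (9): final frequency of followers of the q-quantile independent voter."""
    return (1.0 - s) ** (q - 1.0) / n


def total_follower_frequency(s, n, m=1000):
    """Sum of Eq. (9) over all independent voters in the continuous
    approximation: (number of independent voters) x (mean of f_q over q)."""
    x_final = -(1.0 - s) * math.log(1.0 - s) / s          # Eq. (5)
    mean_f = sum(follower_frequency((i + 0.5) / m, s, n) for i in range(m)) / m
    return n * x_final * mean_f
```

The total equals one up to discretization error for any \(0<s<1\), which agrees with the exact evaluation of the integral using Eqs. (5) and (9).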

Fig. 4

Relationship between the order in which an independent voter makes a decision (horizontal axis) and the frequency of followers of the independent voter (vertical axis). Each dot is obtained by individual-based simulations. Because the number of independent voters, \(X_{t^{*}}\), generally differs between simulation runs, we are not able to calculate the average frequency of followers for a specific value of q. Instead, the frequency of followers in the range of \(q=0.01k\) to \(q=0.01k+0.01\) \((k=0,1,\ldots ,99)\) is calculated for each run, and its average over runs is regarded as the average frequency of followers of the \((0.01k + 0.005)\)-quantile independent voter. Error bars represent one standard deviation. For each value of s, \(10^{4}\) simulation runs were performed to obtain average frequencies. Solid lines are the approximated analytical results obtained in Eq. (9). Group size is \(N = 501\)

3.4 Disparity of influence among independent voters

Next, we observe how the Gini coefficient of the distribution of influence among independent voters changes as the imitation rate s increases in order to explain the relationship between imitation intensity and inequality of influence (see Appendix B for our reason for choosing the Gini coefficient for this purpose). Although the distribution of influence is actually discrete in our model, for simplicity, we pursue this problem by regarding it as a continuous distribution. The Gini coefficient for continuous distributions is defined as follows:

$$\begin{aligned} G=1-2\int _0^1 {L({q}')\,\mathrm{d}{q}'}, \end{aligned}$$
(10)

where we define \(q'\equiv 1-q\). That is, the q-quantile earliest independent voter is the \(q'\)-quantile latest independent voter. Moreover, \(L(q')\) is the Lorenz function, which represents the proportion of the followers of 0-quantile to \(q'\)-quantile independent voters [26] (see also [16]). If all voters have the same numbers of followers, the graph of the Lorenz function (called Lorenz curve) lies on the diagonal. Thus, Eq. (10) suggests that the Gini coefficient is obtained by doubling the area between the diagonal line and the Lorenz curve. A possible value of G ranges from 0 to 1. \(G=0\) means perfect equality (i.e., all voters have the same number of followers), and \(G=1\) means perfect inequality (i.e., a single voter is followed by all the remaining voters).

From Eq. (9), we have the Lorenz function as:

$$\begin{aligned} L({q^{{\prime }}})= & {} \frac{\int _0^{q^{{\prime }}}{({1-s})^{1-\xi }\mathrm{d}\xi } }{\int _0^1 {({1-s})^{1-\xi }\mathrm{d}\xi } }=\frac{( {1-s})^{1-q^{{\prime }}}-({1-s})^{1}}{({1-s})^{0}-({1-s})^{1}} \nonumber \\= & {} \frac{({1-s})\{{({1-s})^{-q^{{\prime }}}-1}\}}{s}. \end{aligned}$$
(11)

Therefore, from Eq. (10), we have the Gini coefficient G as a function of the imitation rate s as

$$\begin{aligned} G(s)= & {} 1-2\int _0^1 {\frac{(1-s)\{(1-s)^{-{q}'}-1\}}{s}}\, \mathrm{d}{q}'\nonumber \\= & {} 2\left[ \frac{1}{s}+\frac{1}{\log (1-s)}-\frac{1}{2}\right] . \end{aligned}$$
(12)

We see that \(G(s)\rightarrow 0\) as \(s\rightarrow 0\) and that \(G(s)\rightarrow 1\) as \(s\rightarrow 1\). We can also prove the monotonicity of the Gini coefficient as the function of imitation rate s (see Appendix C for the proof).
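Equation (12) can be cross-checked against the definition in Eq. (10) by integrating the Lorenz function of Eq. (11) numerically. The following sketch uses a midpoint rule; the function names and the grid size `m` are illustrative.

```python
import math


def lorenz(q_prime, s):
    """Eq. (11): Lorenz function of the influence distribution."""
    return (1.0 - s) * ((1.0 - s) ** (-q_prime) - 1.0) / s


def gini_closed_form(s):
    """Eq. (12): G(s) = 2 [1/s + 1/log(1 - s) - 1/2], for 0 < s < 1."""
    return 2.0 * (1.0 / s + 1.0 / math.log(1.0 - s) - 0.5)


def gini_numeric(s, m=10000):
    """Eq. (10) evaluated with a midpoint rule on m grid points."""
    area = sum(lorenz((i + 0.5) / m, s) for i in range(m)) / m
    return 1.0 - 2.0 * area
```

For \(s=0.5\), both routes give \(G\approx 0.115\), and evaluating `gini_closed_form` on a grid of s values illustrates the monotonic increase stated above.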

Figure 4 indicates that the earlier a voter makes a decision, the stronger the influence she obtains. It also shows that a higher imitation rate s increases the inequality of influences. The Gini coefficients in Fig. 5 indicate that this inequality worsens majority vote accuracy.

Fig. 5

Relationship between the disparity in influences and the majority vote accuracy. The horizontal axis stands for the Gini coefficient, which is obtained from the distribution of influences among independent voters, with s varied from 0.01 to 0.99 at an interval of 0.01. The vertical axis indicates the majority vote accuracy. The solid line is the best fit from the single regression, whose coefficient is 0.9941. We performed \(10^{4}\) simulation runs for each value of s. Group size \(N=51\). Competence \(p=0.6\)

4 Discussion and concluding remarks

This study models the sequential decision-making situation wherein each voter makes a decision at a different time referring to a predecessor’s decision, investigates factors that worsen the majority vote accuracy, and evaluates to what degree those factors are responsible. As a result, we find that the deterioration of majority vote accuracy is attributed to two types of mechanisms. First, majority vote accuracy deteriorates as the number of imitators increases, as Fig. 3 illustrates. To understand this process intuitively, let us consider an extreme situation, where a single independent voter is imitated by all the remaining voters. According to Condorcet’s jury theorem, majority vote accuracy increases as the group size increases. Nevertheless, in this case, the collective decision is the same as a single voter’s decision, and therefore the group loses the advantage of size. This outcome is compatible with [9], who assumed simultaneous correlated voting, although our results differ quantitatively from theirs, as mentioned in Sect. 3.1.

Second, the disparity of influence among independent voters worsens the majority vote accuracy, as Figs. 4 and 5 illustrate. Let us consider the following situation to explain this mechanism. There are three voters making decisions independently. Let us assume that the weights for the three votes are 100, 2, and 1, respectively. The difference among the weights for their votes corresponds to the inequality of influences in our model. Although there are actually three voters, this situation is the same as the case wherein only the first voter casts a vote. That is, because we adopt the simple majority rule, the last two voters cannot affect the collective decision, and the first voter’s single vote determines the collective decision. Clearly, when \(p > 1/2\), the probability that a single voter makes an incorrect decision is higher than the probability that an unweighted majority vote by three voters is incorrect. Again, this group does not take advantage of its size. This outcome is related to the study by [27], which addressed the optimal weighted majority rule. They proved that the weighted majority voting system in which the weight for the vote by individual i with competence \(p_{i}\) is \(\hbox {log}[p_{i}/(1-p_{i})]\) maximizes the majority vote accuracy under the assumption that all competences are larger than one-half. Therefore, under the assumption of equally skilled individuals, as in the present study, the majority rule of distributing weights equally to everyone is the optimal weighted majority rule. The inequality of influences represents the deviation from the optimum. In summary, as the imitation rate s increases, the number of independent voters decreases (the first mechanism), and the inequality of influences among independent voters grows (the second mechanism). We have evaluated to what extent the combination of these two mechanisms worsens the majority vote accuracy by using the novel criterion called effective group size.
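The three-voter example with weights 100, 2, and 1 can be verified by enumerating all vote profiles. The sketch below is an illustration under the assumptions of the example (independent votes, common competence p); the function name is ours.

```python
from itertools import product


def weighted_majority_accuracy(weights, p):
    """Exact accuracy of weighted majority voting among independent voters
    who are each correct with probability p (brute-force enumeration)."""
    total = sum(weights)
    accuracy = 0.0
    for votes in product((True, False), repeat=len(weights)):
        prob = 1.0
        correct_weight = 0
        for w, correct in zip(weights, votes):
            prob *= p if correct else 1.0 - p
            if correct:
                correct_weight += w
        # The group is correct when the correct option holds more than
        # half of the total weight.
        if 2 * correct_weight > total:
            accuracy += prob
    return accuracy
```

With \(p=0.7\), the weights (100, 2, 1) yield an accuracy of 0.7, the same as a single voter, whereas equal weights (1, 1, 1) yield 0.784.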

To highlight the second mechanism, Appendix D compares the majority vote accuracies of two models, which generate the same frequency of independent voters when all voters reach decisions. One is our Model 3, which is a sequential decision-making model. The other is the Model 1 in [9], which describes a simultaneous decision-making situation. We find that the two models generate different majority vote accuracies. This implies that in considering the group decision accuracy, not only the frequency of independent voters but also the process through which the frequency is determined or the circumstance in which voters are involved should be taken into account.

Several theoretical studies have previously investigated weighted voting (e.g., [27]). However, in these studies, weights are determined exogenously, and the researchers do not discuss how the weights are determined through social interactions. In contrast, the current study investigates how the distribution of influences, which is qualitatively equivalent to the distribution of weights and is generated endogenously in the process of sequential decision making, affects majority vote accuracy. Consequently, we demonstrate that the influence of an independent voter decays exponentially with her order of decision making (Eq. (9)). In Appendix E, we show that the conditional distribution of influences among all voters, given that the voter is an independent voter, follows a power-law relationship (see Eq. (32)).

Our model might be too simple to explain the micro-foundation that worsens group decision accuracy. For example, in the present model, voters’ competence is assumed to be homogeneous. However, in the real world, individuals with different competences make collective decisions. The assumption of heterogeneous competence raises a new research question (cf. [28]). Studying how the correlation between a voter’s competence and imitation rate affects majority vote accuracy would be interesting. Moreover, we treat naïve imitations, whereas in the sequential decision-making model by [7] each individual makes a decision by calculating the probability that each option is correct based on the observation of all predecessors’ actions. Future studies should investigate how different micro-level decision-making processes affect group decision accuracy by incorporating empirical studies. Last, our idea of tracing the path of imitation and then deriving the disparity of influence requires only the imitation rate s, and does not require the distribution of competences. Therefore, our idea is expected to be a useful tool for future work.