In [10] we introduced two closely related approaches to modelling election dynamics: one is called an election-microstructure approach, or a structural approach, which encapsulates the structural details of a voting scenario; the other is called a representative voter approach, or a reduced-form approach, which captures qualitative features of election dynamics without specifying the structural details. Let us begin by briefly describing these approaches.
In the structural approach, one considers a set of issues that are of concern to the electorate. These may include, for instance, social welfare, immigration, abortion, climate policy, gun control, healthcare, and so on. The l-th candidate’s position on the k-th issue is then represented by \(X^l_k\), whose value is not necessarily apparent to the voters. The uncertainty in the factor \(X^l_k\) thus makes it a random variable on a suitably defined probability space equipped with the ‘real-world’ probability measure \({{\mathbb {P}}}\). The idea is that if the k-th factor were concerned with, say, climate policy, then in a simple model setup \(X^l_k=0\) would represent the situation in which candidate l is against implementing policies to counter climate change, while \(X^l_k=1\) would represent the situation in which candidate l is for implementing such policies. Not all factors need be binary, of course, but at any rate the voters are not certain about the positions the candidates would take on these issues if elected.
The voters, however, possess partial information concerning the values of these factors, and as they learn more about a candidate, or about the political party to which the candidate belongs, their best estimates of these factors change in time. The voters also have their preferences: a position of a candidate on a given issue that is desirable to one voter may well be undesirable to another. Let us denote by \(w^k_n\) the preference weight assigned by the n-th voter to the k-th factor, which may be positive or negative, depending on the voter’s stance on that issue. Then, writing \({{\mathbb {E}}}_t[X^l_k]\) for the expectation under the probability measure \({{\mathbb {P}}}\) conditional on the information available at time t, the score at time t assigned to the l-th candidate by the n-th voter, under the assumption of a linear scoring rule, is given by the sum
$$\begin{aligned} S^l_n(t) = \sum _{k} w^k_n\, {{\mathbb {E}}}_t[X^l_k]\, . \end{aligned}$$
(1)
In particular, the voter will select the candidate with the highest score at the time \(T\ge t\) at which the election takes place. Thus, by modelling the flow of information associated with each of the factors, we arrive at a rather detailed dynamical model for the election; this is the basis of the structural approach.
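To make the scoring rule concrete, the following minimal sketch (in Python, with entirely hypothetical factor estimates and preference weights) computes the scores (1) for a single voter and identifies the preferred candidate.

```python
import numpy as np

# Hypothetical inputs: 3 candidates, 4 policy factors, a single voter.
# Rows are candidates l, columns are factors k; the entries play the
# role of the current best estimates E_t[X^l_k] (values made up).
estimates = np.array([
    [0.9, 0.2, 0.5, 0.1],
    [0.3, 0.8, 0.4, 0.7],
    [0.5, 0.5, 0.9, 0.3],
])

# Preference weights w^k_n of the voter; signs encode for/against.
weights = np.array([1.0, -0.5, 2.0, -1.5])

# Linear scoring rule (1): S^l_n(t) = sum_k w^k_n E_t[X^l_k].
scores = estimates @ weights

# The voter backs the candidate with the highest score.
print("scores:", scores, "-> preferred candidate:", int(np.argmax(scores)))
```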
In the present paper we shall be concerned primarily with the reduced-form approach, and develop the theory in more detail. We consider an election in which there are N candidates. In the reduced-form approach the voters are in general not fully certain about which candidate they should be voting for, but they hold opinions based on (i) the information available to them about the candidates, or about the political parties to which the candidates belong, and (ii) their preferences. The diverse opinions held by the public can then be aggregated in the form of a probability distribution, representing the public preference likelihoods of the candidates. Thus we can think of an abstract random variable X, defined on a probability space, taking the value \(x_i\) with the a priori probability \({{\mathbb {P}}}(X=x_i)=p_i\), where \(x_i\) represents the i-th candidate and the a priori probability \(p_i\) represents today’s opinion-poll statistics.
Today’s public opinion, however, changes over time in accordance with the unravelling of new information relevant to the election. Hence the a priori probabilities will be updated accordingly, generating shifts in the opinion-poll statistics. To model these dynamics, let us assume (merely for simplicity) that reliable knowledge regarding which candidate represents the ‘best choice’ increases linearly in time, at a constant rate \(\sigma \). There are also rumours and speculations that obscure the reliable information in the form of noise. This uncertainty, or noise, will be modelled by a Brownian motion, denoted \(\{B_t\}\), which is assumed to be independent of X, for otherwise it could not be viewed as representing pure noise. Hence, under these modelling assumptions the flow of information, which we denote by \(\{\xi _t\}\), can be expressed in the form
$$\begin{aligned} \xi _t = \sigma X t + B_t . \end{aligned}$$
(2)
For the voters, the quantity of interest is the actual value of X, but there are two unknowns, X and \(\{B_t\}\), and only one known, \(\{\xi _t\}\). In this situation, a rational individual considers the probability that \(X=x_i\) conditional on the information contained in the time series \(\{\xi _s\}_{0\le s\le t}\) gathered up to time t.
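As an illustration of the observer’s situation, here is a minimal simulation of the information process (2), assuming for definiteness a binary X taking the values 0 and 1 with equal prior probability, and an arbitrarily chosen \(\sigma \); only \(\xi \) would be visible to the voters.

```python
import numpy as np

rng = np.random.default_rng(seed=1)
sigma, T, n_steps = 1.0, 5.0, 5000   # parameters chosen arbitrarily
dt = T / n_steps

# Draw the hidden outcome X and build xi_t = sigma*X*t + B_t on a grid.
X = rng.choice([0.0, 1.0], p=[0.5, 0.5])
t = dt * np.arange(1, n_steps + 1)
B = np.cumsum(rng.normal(0.0, np.sqrt(dt), n_steps))   # Brownian path
xi = sigma * X * t + B

# The observer sees only the time series xi, not X or B separately.
print("hidden X =", X, "; observed xi at time T:", round(xi[-1], 3))
```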
We proceed to determine the conditional probability \({{\mathbb {P}}}(X=x_i|\{\xi _s\}_{0\le s\le t})\). We begin by remarking that the information process \(\{\xi _t\}\) is Markovian. An intuitive way of seeing this is to observe that the increment \(\mathrm{d}\xi _t\) of \(\{\xi _t\}\) is the sum of \(\sigma X \mathrm{d}t\) and \(\mathrm{d}B_t\); the coefficient of \(\mathrm{d}t\) is constant in time, while the Brownian motion has independent increments, so the process \(\{\xi _t\}\) of (2) does not carry any ‘memory’. Establishing the Markov property amounts to showing that \( {{\mathbb {P}}}(\xi _t\le x | \xi _s,\xi _{s_1},\xi _{s_2},\ldots ,\xi _{s_k}) = {{\mathbb {P}}}( \xi _t\le x|\xi _s)\) for any \(x\) and any collection of times \(t\ge s\ge s_1\ge s_2\ge \cdots \ge s_k\ge 0\). However, the process \(\{\xi _t\}\) conditional on \(X=x_i\) is just a drifted Brownian motion, which clearly is Markovian, so we have
$$\begin{aligned} {{\mathbb {P}}}\left( \xi _t\le x| \xi _s,\xi _{s_1}, \ldots ,\xi _{s_k}\right) &= \sum _i {{\mathbb {P}}}\left( \xi _t\le x| X=x_i, \xi _s,\xi _{s_1}, \ldots ,\xi _{s_k}\right) {{\mathbb {P}}}(X=x_i) \\ &= \sum _i {{\mathbb {P}}}\left( \xi _t\le x| X=x_i,\xi _s\right) {{\mathbb {P}}}(X=x_i) \\ &= {{\mathbb {P}}}\left( \xi _t\le x| \xi _s\right) . \end{aligned}$$
(3)
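The property (3) can also be probed numerically: among simulated paths for which \(\xi _s\) lies in a narrow bin, the distribution of \(\xi _t\) should not depend on the earlier value \(\xi _{s_1}\). The following rough Monte Carlo check (times, bin, and parameters all chosen arbitrarily) compares the mean of \(\xi _t\) across two subsamples distinguished only by \(\xi _{s_1}\); under the Markov property the two means should nearly coincide.

```python
import numpy as np

rng = np.random.default_rng(seed=2)
sigma, s1, s, t, n = 1.0, 0.5, 1.0, 2.0, 2_000_000

# Binary X with symmetric prior; sample (xi_{s1}, xi_s, xi_t) jointly.
X = rng.choice([0.0, 1.0], size=n)
B_s1 = rng.normal(0.0, np.sqrt(s1), n)
B_s = B_s1 + rng.normal(0.0, np.sqrt(s - s1), n)
B_t = B_s + rng.normal(0.0, np.sqrt(t - s), n)
xi_s1 = sigma * X * s1 + B_s1
xi_s = sigma * X * s + B_s
xi_t = sigma * X * t + B_t

# Condition on xi_s lying near 0.5, then split by the earlier value.
sel = np.abs(xi_s - 0.5) < 0.02
low = sel & (xi_s1 < 0.25)
high = sel & (xi_s1 >= 0.25)

# The two conditional means agree (up to sampling error) if the extra
# knowledge of xi_{s1} is irrelevant given xi_s.
print(xi_t[low].mean(), xi_t[high].mean())
```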
In addition to the Markov property, the random variable X is determined by the time series \(\{\xi _t\}\), because with probability one we have
$$\begin{aligned} \lim _{t\rightarrow \infty } \frac{\xi _t}{\sigma t} = X . \end{aligned}$$
(4)
It follows that the conditional probability \({{\mathbb {P}}}(X=x_i|\{\xi _s\}_{0\le s\le t})\) simplifies to \({{\mathbb {P}}}(X=x_i|\xi _t)\).
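The convergence (4) is easily visualised along a single simulated path: the ratio \(\xi _t/\sigma t\) settles down to the realised value of X as t grows. A small sketch (parameters arbitrary):

```python
import numpy as np

rng = np.random.default_rng(seed=3)
sigma, T, n_steps = 1.0, 1000.0, 100_000   # long horizon, arbitrary sigma
dt = T / n_steps

X = rng.choice([0.0, 1.0])                 # hidden outcome
t = dt * np.arange(1, n_steps + 1)
xi = sigma * X * t + np.cumsum(rng.normal(0.0, np.sqrt(dt), n_steps))

# xi_t / (sigma t) approaches X, cf. (4).
for tt in (1.0, 10.0, 100.0, 1000.0):
    i = int(tt / dt) - 1
    print(f"t = {tt:6.0f}:  xi_t/(sigma t) = {xi[i] / (sigma * t[i]):.4f}")
print("hidden X =", X)
```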
The logical step of converting the prior probabilities \({{\mathbb {P}}}(X=x_i)\) into the posterior probabilities \({{\mathbb {P}}}(X=x_i|\xi _t)\) is captured by the Bayes formula:
$$\begin{aligned} {{\mathbb {P}}}(X=x_i|\xi _t) = \frac{{{\mathbb {P}}}(X=x_i)\,\rho (\xi _t|X=x_i)}{\sum _{j} {{\mathbb {P}}} (X=x_j)\,\rho (\xi _t|X=x_j)} . \end{aligned}$$
(5)
Here the conditional density function \(\rho (\xi _t|X=x_i)\) for the random variable \(\xi _t\) is defined by the relation
$$\begin{aligned} {{\mathbb {P}}}\left( \xi _t\le y|X=x_i\right) =\int _{-\infty }^y \rho (\xi |X=x_i)\,\mathrm{d}\xi , \end{aligned}$$
(6)
and is given by
$$\begin{aligned} \rho (\xi |X=x_i)=\frac{1}{\sqrt{2\pi t}} \exp \left( - \frac{(\xi -\sigma x_i t)^2}{2t}\right) . \end{aligned}$$
(7)
This follows from the fact that, conditional on \(X=x_i\), the random variable \(\xi _t\) is normally distributed with mean \(\sigma x_i t\) and variance t. Substituting (7) in (5), the factor \((2\pi t)^{-1/2}\exp (-\xi _t^2/2t)\), common to the numerator and the denominator, cancels, and we thus deduce that
$$\begin{aligned} {{\mathbb {P}}}(X=x_i|\xi _t) =\frac{p_i\exp \left( \sigma x_i \xi _t- \frac{1}{2} \sigma ^2 x_i^2 t\right) }{\sum _j p_j \exp \left( \sigma x_j \xi _t-\frac{1}{2} \sigma ^2 x_j^2 t\right) } . \end{aligned}$$
(8)
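In simulations, the posterior (8) is a one-line computation. The sketch below (with three candidates and arbitrarily chosen labels \(x_i\), prior weights, and \(\sigma \)) tracks the opinion-poll probabilities along a single simulated information path; as t grows the posterior concentrates on the realised value of X, in keeping with (4).

```python
import numpy as np

rng = np.random.default_rng(seed=4)
sigma, T, n_steps = 1.0, 5.0, 5000   # arbitrary parameters
dt = T / n_steps

x = np.array([0.0, 1.0, 2.0])        # labels x_i for three candidates
p = np.array([0.5, 0.3, 0.2])        # prior p_i: today's poll statistics

X = rng.choice(x, p=p)               # hidden 'best choice'
t = dt * np.arange(1, n_steps + 1)
xi = sigma * X * t + np.cumsum(rng.normal(0.0, np.sqrt(dt), n_steps))

# Posterior (8): pi_i(t) proportional to
#   p_i * exp(sigma * x_i * xi_t - sigma^2 * x_i^2 * t / 2).
w = p * np.exp(sigma * np.outer(xi, x) - 0.5 * sigma**2 * np.outer(t, x**2))
pi = w / w.sum(axis=1, keepdims=True)

print("hidden X =", X)
print("posterior probabilities at time T:", np.round(pi[-1], 3))
```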
Inferences based on the use of (8) are optimal in the sense that they minimise the uncertainty concerning the value of X, as measured by the variance or by entropic measures, subject to the information available. Thus the a posteriori probabilities (8) determine the best estimate for the unknown variable X, in the sense of minimising the mean-square error.
In the reduced-form approach it is the conditional probability (8), which is a nonlinear function of the model input \(\{\xi _t\}\), that models the complicated dynamics of the opinion-poll statistics. Our objective in this paper is to investigate various properties of the model, as well as to explore different ways in which the model can be exploited.