
Ideological extremism and primaries


This paper is the first one to present a model of primaries with endogenous party affiliations. I show that closed primaries (where only affiliated party members can vote) result in more charismatic candidates than open primaries. This occurs because, in equilibrium, closed-primary voters care more about winning and therefore they are more willing to trade off their ideologically preferred candidate for one who is more likely to win, i.e., a more charismatic one. I also show that under open primaries, the party leaders have higher incentives to choose more extreme platforms. As a consequence, open-primary nominees are more likely to be extremists than closed-primary ones—which is consistent with the most recent empirical evidence. Finally, I show that, if instead of organizing primaries, party leaders were to handpick the nominees, the candidates would be even more moderate and more charismatic.




  1. See for a political movement that endorses open primaries, with a reasoning similar to this quote by A. Maldonado, Lieutenant Governor in California (April 29, 2010): “If you want to win a close primary on the Republican side, you have to veer hard to the right, and if you want to win a Democrat primary, you veer hard to the left. In the middle, where you have independents and decline-to-states, guess what they have to do in California? They have to ask for permission of a party to participate in a primary election.”.

  2. For instance, Gerber and Morton (1998) and Mcghee et al. (2013) have shown that more open primaries may lead to more extreme candidates than closed ones. Gerber and Morton (1998) show this finding in Tables 3 and 4 of their paper, and Mcghee et al. (2013) highlight that, if anything, their results contradict the common wisdom: “...more open primaries [end up] electing legislators who are more extreme...” (page 2 of their paper).

  3. Hamermesh (2006), Berggren et al. (2010) and Lenz and Lawson (2011) show that candidates’ beauty has a positive effect on electoral results. Moreover, our results hold if charisma is a trait that only affects voters in the general election (possibly less sophisticated and easier to sway with non-policy attributes) but not the primary voters (more sophisticated). Although charisma is related to valence, it is worth emphasizing that the modeling assumptions impose a narrower interpretation than the original one by Stokes (1963). Examples of individual characteristics that are sometimes thought of as valence but do not fit the model, because they are intertwined with policy-making, are incumbency (Stone and Simas 2010), character (à la Callander and Wilkie 2007 or Kartik and McAfee 2007) and quality (Caillaud and Tirole 1999, 2016).

  4. In this regard, it is “deterministic” membership, as opposed to Gomberg et al. (2016).

  5. With respect to ideology, the underlying reasoning is that the cost of interacting with other party members is increasing in the disagreement, i.e., in the ideological distance. With respect to charisma, within the scope of the paper, Mattozzi and Merlo (2008) and Berggren et al. (2010) provide some related examples that support the assumption. Beyond the scope of politics, it has been shown that some observables, like beauty, have positive effects on labor market outcomes, independently of whether they are productivity-enhancing, e.g., Mobius and Rosenblat (2006).

  6. Please note that this is just a salient example, and that the aim of the paper is, by no means, to explain or fit the 2016 US elections.

  7. The most recent empirical evidence showing that open primaries, if anything, lead to more extreme candidates looks at state legislators. We believe that candidates’ ideologies in primaries for subnational elections need not be known to the general public. Hence, we focus on this “low-information” environment.

  8. The current setup is a low-information environment regarding ideological positions. Consistent with Snyder and Ting (2002) and references therein, voters cannot distinguish between conservative and liberal candidates within a party, but they can use party labels to distinguish between candidates’ overall positions. Thus, in the main model, I assume that candidates’ ideology is private information. In Sect. 7.2 in “Appendix,” I relax this assumption and obtain similar results.

  9. Notice that, given \(B(\delta _i,c_i)\), Assumption 2 is a condition on the platforms. However, keeping the platforms fixed, it could be reinterpreted as a condition on \(B(\delta _i,c_i)\), for instance, a very negative \(B_\delta \).

  10. For instance, charisma affects the votes that a candidate gets but is unrelated to policy. See Berggren et al. (2010), Hamermesh (2006), Lenz and Lawson (2011) and Lawson et al. (2010) for some examples and empirical evidence.

  11. In accordance with the literature of valence, charisma could be additionally included as a valence term in the utility function, and all my results would hold, as long as charisma has a tiny additional effect on the candidates’ electability. In light of that literature, my model would address the issue of what would happen if charisma had this additional electoral effect, not included in the utility. That is, what if voters are more likely to vote for a good looking candidate even if good looks do not have an effect on their payoff?

  12. If those voters exist, it would be optimal for them to choose a lower level of charisma and, therefore, lower levels of uncertainty (in case that party actually wins).

  13. More generally, let \(B(\delta _i,c_i)=b(c_i)-|x_i-x|^{\beta }\). The variance is convex in \(c_i\) if and only if

    $$\begin{aligned} \left( \frac{2}{\beta }-1\right) (b_c)^2+b_{cc}b(c)>0, \end{aligned}$$

    which always holds for \(0<\beta <2\) and \(b_{cc}>0\), or for \(0<\beta <2\) and \(b_{cc}<0\) with \(b_{cc}\) large (i.e., close to 0).

  14. Interchangeably, I call this voter the nominator, the decisive voter or the primary’s median voter.

  15. Notice that these are the ideal textbook cases; in Sect. 4.3 we consider a continuum of types of primaries, between the open and closed.

  16. In Latin America there is a large variety of arrangements for primaries, as described in Slough et al. (2019).

  17. Whether a primary is more or less open depends on the affiliation and voting requirements of each party.

  18. This classification of the rules depends on the decisive voter, whose location could depend on the platforms’ choice. Since there is no model for such relationship when the primaries are not open or closed, we simply restrict to the cases in which the ex-ante order of openness (without endogenizing the platforms) coincides with the ex-post order. There is a large family of rules for which this restriction holds: two realistic and simple examples are to have the decisive voter’s location proportional to the distance to 0, or fixed. A broader discussion on the internal organization of parties and the endogenization of platforms can be found in Ansolabehere et al. (2012).

  19. When the benefits of affiliation increase with charisma (as it follows from Assumption 1), charismatic citizens who are far from the party are more attracted to it. Hence, the larger the returns to charisma (\(B_c\)), the noisier the signal of charisma (i.e., larger variance for the same level of charisma).

  20. Note that when the benefits of affiliation increase with charisma (as it follows from Assumption 1), charismatic citizens who are far from the party are more attracted to it. Hence, the larger the returns to charisma (\(B_c\)), the noisier the signal of charisma (i.e., larger variance for the same level of charisma).

  21. The two most widely quoted criteria for a democracy are due to Dahl (1989): (1) effective participation and (2) voting equality at the decisive stage. Furthermore, “Genuine democratic elections serve to resolve peacefully the competition for political power within a country and thus are central to the maintenance of peace and stability.”, from the Declaration of Principles for International Election Observation, endorsed by the UN and various organizations such as “The Carter Center”.

  22. See Aragon (2013) or Serra (2011), for related literature.

  23. I thank an anonymous referee for pointing this out.

  24. Similarly, if the party leaders were more extreme, the median voter’s incentives to increase charisma would be very steep under relatively closed primaries because of the extreme polarization. Hence, the variance becomes so large under closed primaries that open primaries result in greater welfare.

  25. I thank an anonymous referee for this idea.

  26. In this new setup, two affiliated citizens equally distant from \(\pi _P\) have the same expected charisma: the one closer to the median is identical to one on the other side of the party but more likely to win; hence, in equilibrium, the one further away is never nominated.

  27. They would still be risk-averse on ideologies, but now there is no incomplete information on that dimension.



Author information


Corresponding author

Correspondence to Agustin Casas.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

I am indebted to David Austen-Smith, for his patience and guidance. I also benefited from conversations with Tim Feddersen, Bard Harstad, Antoine Loeper, Steve Callander, Alessandro Pavan, Andre Trindade, Carlo Prato, William Minozzi, Giorgio Zanarone, Galina Zudenkova, Martin Gonzalez Eiras and my colleagues at Northwestern University and Universidad Carlos III. Casas thanks the support of the Spanish Research Agency through Grant ECO2017-85763-R (AEI/FEDER, UE). All remaining errors are mine.

7 Appendix


7.1 Main results

Proof (Proof of Lemma 1)

For an interior equilibrium, from the first derivative of the expected utility in Eq. 11, I obtain

$$\begin{aligned} \frac{\partial P_L}{\partial c_{L}}\Pi _{i}(c_L,c_R)+P_L \frac{\partial \Pi _i(c_L,c_R)}{\partial c_L}&=0\nonumber \\ \frac{\omega -\frac{\partial V(c_L)}{\partial c_L}}{2\alpha }\Pi _{i}(c_L,c_R)-P_L\frac{\partial V(c_{L})}{\partial c_{L}}&=0 \end{aligned}$$

Hence, the second derivative can be written as

$$\begin{aligned} \frac{\partial {\text {FOC}}}{\partial c_L}&=2\frac{\partial P_L}{\partial c_{L}}\frac{\partial \Pi _{i}(c_L,c_R)}{\partial c_L}+P_L\frac{\partial ^2\Pi _i(c_L,c_R)}{\partial c_L^2}+\Pi _i\frac{\partial ^2P_L(c_L,c_R)}{\partial c_L^2} \\&=-2\frac{\omega -\frac{\partial V(c_L)}{\partial c_L}}{2\alpha }\frac{\partial V(c_L)}{\partial c_L}-P_L\frac{\partial ^2 V(c_{L})}{\partial c_{L}^2}-\frac{\Pi _i}{2\alpha }\frac{\partial ^2 V(c_{L})}{\partial c_{L}^2}, \end{aligned}$$

which, for convenience, I write as

$$\begin{aligned} \frac{\partial {\text {FOC}}}{\partial c_L}=-\frac{\omega -V'}{\alpha }V'- V''\left( P_L+\frac{\Pi _i}{2\alpha }\right) , \end{aligned}$$

and satisfies the SOC if and only if

$$\begin{aligned} V''\ge \frac{-2(\omega -V')V'}{2\alpha P_L +\Pi _i} \end{aligned}$$

In particular, for all \(V''\ge 0\) the SOC is satisfied. Notice that if \(V''<0\), there is no interior \(c_L\) that maximizes the probability of winning, i.e., everybody would choose \(c^{*}={\bar{c}}\). And if \(V'<0\), everybody would choose \(c^*=0\). \(\square \)
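The second-order condition above can be checked numerically. The following is a minimal sketch, not part of the original proof: the function names and all parameter values are hypothetical, and \(V'\), \(V''\), \(P_L\) and \(\Pi _i\) are passed in as plain numbers rather than derived from the model.

```python
def foc_slope(omega, alpha, v1, v2, p_left, gain):
    """Derivative of the FOC w.r.t. c_L:
    -(omega - V')/alpha * V' - V'' * (P_L + Pi_i / (2*alpha)).
    v1 stands for V', v2 for V'', gain for Pi_i (illustrative names)."""
    return -(omega - v1) / alpha * v1 - v2 * (p_left + gain / (2 * alpha))


def soc_threshold(omega, alpha, v1, p_left, gain):
    """Lower bound on V'' implied by the SOC:
    -2*(omega - V')*V' / (2*alpha*P_L + Pi_i)."""
    return -2 * (omega - v1) * v1 / (2 * alpha * p_left + gain)


# Hypothetical values, chosen only for illustration (0 < V' < omega).
omega, alpha, p_left, gain, v1 = 1.0, 2.0, 0.5, 1.0, 0.3

for v2 in (0.5, 0.0, -0.05, -1.0):
    slope = foc_slope(omega, alpha, v1, v2, p_left, gain)
    holds = v2 >= soc_threshold(omega, alpha, v1, p_left, gain)
    print(f"V''={v2:+.2f}  dFOC/dc_L={slope:+.4f}  SOC holds: {holds}")
```

With these numbers the slope is negative for every \(V''\ge 0\) and only turns positive once \(V''\) falls below the threshold, which is the pattern the lemma describes.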

Proof (Proof of Proposition 1 and Corollary 1)

In the proof of Lemma 1 it has been shown that the objective function is strictly concave, and therefore, there is a unique interior equilibrium. Hence, I need to show that \(\frac{\partial c^*_p}{\partial d_{P}}\) has the right sign; i.e., negative for \(p=L\) and positive for \(p=R\). Proving it for one of the two cases is enough, so for consistency, I show it for \(p=L\). Using the implicit function theorem, it is enough to show that the cross derivative of the objective function is negative. That is, the partial derivative of the FOC, Eq. 18 (in the proof of Lemma 1), with respect to \(d_L\) must be negative.

$$\begin{aligned} \text {sign}\left( \frac{\partial c^*_L}{\partial d_L}\right)&=\text {sign}\left( \frac{\partial {\text {FOC}}}{\partial d_L}\right) \\&=\text {sign}\left( \frac{\omega -\frac{\partial V(c_L)}{\partial c_L}}{2\alpha }\frac{\partial \Pi _{i}(c_L,c_R)}{\partial d_L}\right) \\&=\text {sign}\left( \frac{\omega -\frac{\partial V(c_L)}{\partial c_L}}{2\alpha }2(l-r)\right) . \end{aligned}$$

In the equation, \(\omega -\frac{\partial V(c_L)}{\partial c_L}>0\) follows from \(c<{\bar{c}}\), and \(l-r<0\) from the assumption that \(l<0<r\). Hence, \(\frac{\partial c^*_L}{\partial d_L}<0\).

Following the same reasoning, the sign of the derivative of Eq. 18 with respect to the parties’ platforms determines the sign of \(\frac{\partial c^*_p}{\partial \pi _p}\). First, notice that the corollary holds in the symmetric case where \(r=-l\). Hence,

$$\begin{aligned} \text {sign}\left( \frac{\partial c^*_L}{\partial l}\right) |_{r=-l}&=\text {sign}\left( \frac{\partial {\text {FOC}}}{\partial l}\right) |_{r=-l}\\&=\text {sign}\left( \frac{\omega -\frac{\partial V(c_L)}{\partial c_L}}{2\alpha }\frac{\partial (4 d_{L}l)}{\partial l}-V'\frac{\partial P_L}{\partial l}\right) \\&=\text {sign}\left( \frac{\omega -\frac{\partial V(c_L)}{\partial c_L}}{\alpha }2 d_{L}-0\right) . \end{aligned}$$

And since the decisive voter in a left primary has \(d_{L}\le 0\), \(\frac{\partial c^*_L}{\partial l}|_{r=-l}\) is negative. Therefore, charisma and policy uncertainty increase with polarization. \(\square \)
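The two expressions used in this proof, \(\frac{\partial \Pi _i}{\partial d_L}=2(l-r)\) and \(\Pi _{d_L}=4d_Ll\) in the symmetric case, can be recovered in one step. The sketch below assumes a quadratic-loss expected utility, \(u^e_i(\pi _P,c_P)=-(i-\pi _P)^2-V(c_P)\); this form is not restated in the appendix and is inferred here from the derivatives the proofs use.

$$\begin{aligned} \Pi _i(c_L,c_R)&=-(i-l)^2-V(c_L)+(i-r)^2+V(c_R)\\&=r^2-l^2+2i(l-r)+V(c_R)-V(c_L),\\ \frac{\partial \Pi _i(c_L,c_R)}{\partial i}&=2(l-r),\\ \Pi _i(c_L,c_R)|_{r=-l}&=4il+V(c_R)-V(c_L). \end{aligned}$$

Evaluated at the decisive voter \(i=d_L\) and with \(c_L=c_R\), the last line reduces to \(4d_Ll\), the term differentiated in the symmetric case above.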

Proof (Proof of Propositions 2 and 3)

Notice that, formally, this proof is for Proposition 3. But Proposition 2 is included in the latter, so I omit its proof. Let \(c^{n}_P\) refer to the charisma of party P under nomination rule n. The function below implicitly defines party L’s best response as a function of the charisma of the other party R and the platforms for the general case.

$$\begin{aligned} V'(c^{n}_L)=\frac{\partial V(x_L|c^{n}_L)}{\partial c^n_L}=\frac{\omega \Pi ^n_{d_L}(l(n),r(n),c_L^n,c_R^n)}{2\alpha P_L(l(n),r(n),c_L^n,c_R^n)+\Pi ^n_{d_L}(l(n),r(n),c_L^n,c_R^n)}. \end{aligned}$$
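The best response above is a rearrangement of the first-order condition from the proof of Lemma 1. As a short worked step, written with \(V'\equiv \frac{\partial V(c_L)}{\partial c_L}\) and assuming \(2\alpha P_L+\Pi _i\ne 0\):

$$\begin{aligned} \frac{\omega -V'}{2\alpha }\Pi _{i}=P_LV' \;\Longleftrightarrow \; \omega \Pi _{i}=V'\left( 2\alpha P_L+\Pi _{i}\right) \;\Longleftrightarrow \; V'=\frac{\omega \Pi _{i}}{2\alpha P_L+\Pi _{i}}. \end{aligned}$$

Whenever \(\alpha \), \(P_L\) and \(\Pi _i\) are strictly positive, the fraction \(\frac{\Pi _i}{2\alpha P_L+\Pi _i}\) lies strictly between 0 and 1, so \(0<V'<\omega \). This is consistent with the sign \(\omega -V'>0\) used in the proof of Proposition 1.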

Taking into account the reaction functions, for a given nomination rule n, the party leaders choose platforms that maximize their expected utility:

$$\begin{aligned} \max _{l(n)}&P_{L}\left[ u^e_{z_{L}}(l(n),c_L|n)-u^e_{z_{L}}(r(n),c_R|n)\right] +u^e_{z_{L}}(r(n),c_R|n) \end{aligned}$$

Let \(\Pi _{i}\equiv \left[ u^{e}_{i}(c_L)-u^{e}_{i}(c_R)\right] \). The maximization problem can be rewritten in terms of the expected policy gain of the party leader with ideology \(z_L\) when the nomination rule n is used, \(\Pi _{z_{L}}^{n}\),

$$\begin{aligned} \max _{l(n)}&P_{L}\Pi ^n_{z_{L}}(l(n),r(n),c_L,c_R)+u^e_{z_{L}}(r(n),c_R|n), \end{aligned}$$

The FOC for L is

$$\begin{aligned} \frac{\partial P_{L}(c_L^n,c_R^n)}{\partial l(n)}\Pi _{z_{L,n}}(c_L^n,c_R^n)+ P_{L}(c_L^n,c_R^n)\frac{\partial \Pi _{z_{L,n}}(c_L^n,c_R^n)}{\partial l(n)}+\frac{\partial u^e_{z_{L}}(r(n),c_R|n)}{\partial l(n)}=0. \end{aligned}$$

The last term indicates that the party leader also takes into account the effect on his payoff under the policies of the other party. In order to be precise, I write down the partial derivatives separately, for the general case, and later on I plug them to obtain the best responses and the symmetric equilibrium.

$$\begin{aligned} \frac{\partial P_{L}(l(n),r(n),c_L^n,c_R^n)}{\partial l(n)}&=\left[ \omega \left( \frac{\partial c_L^n}{\partial l(n)}-\frac{\partial c_R^n}{\partial l(n)}\right) {+}\frac{\partial V(c_R^n)}{\partial l(n)}-\frac{\partial V(c_L^n)}{\partial l(n)}-2l^n\right] /2\alpha \\ \frac{\partial \Pi _{z_L}(l(n),r(n),c_L^n,c_R^n)}{\partial l(n)}&=2(z_L-l^n)+\frac{\partial V(c_R^n)}{\partial l(n)}-\frac{\partial V(c_L^n)}{\partial l(n)}\\ \frac{\partial u^e_{z_{L}}(r(n),c_R|n)}{\partial l(n)}&=-\frac{\partial V(c_R^n)}{\partial l(n)} \end{aligned}$$

Plugging the equations in the FOC, we obtain L’s best response. But since we are focusing on the symmetric equilibrium, we can impose \(z_L=-z_R\), \(l(n)=-r(n)\), \(\frac{\partial c_L^n}{\partial l(n)}=\frac{\partial c_R^n}{\partial l(n)}\) and \(\frac{\partial V(c^n_{R})}{\partial c_R^n}=\frac{\partial V(c^n_{L})}{\partial c_L^n}\). Moreover,

$$\begin{aligned} \frac{\mathrm{d}V^n_{R}}{\mathrm{d}l(n)}=\frac{\partial V(c^n_{R})}{\partial c_R^n}\frac{\partial c_R^n}{\partial l(n)}=\frac{\partial V(c^n_{L})}{\partial c_L^n}\frac{\partial c_L^n}{\partial l(n)}=\frac{\partial V^n_{L}}{\partial l(n)}. \end{aligned}$$

And so the equations above simplify to

$$\begin{aligned} \frac{\partial P_{L}(l(n),r(n),c_L^n,c_R^n)}{\partial l(n)}&=-2l(n)/2\alpha \\ \frac{\partial \Pi _{z_L}(l(n),r(n),c_L^n,c_R^n)}{\partial l(n)}&=2(z_L-l(n)) \end{aligned}$$

Therefore, to obtain the symmetric equilibrium, we can look at the following equation

$$\begin{aligned} 0&=-\frac{l(n)}{\alpha }\Pi ^n_{z_{L}}+(z_L-l(n))-\frac{\partial V^n_L}{\partial l(n)} \end{aligned}$$

At the symmetric equilibrium \(\Pi ^n_{z_{L}}=4z_L l(n)\); hence, the equilibrium condition can be rearranged into

$$\begin{aligned} \frac{4z_L (l(n))^2}{2\alpha }+l(n)=z_L-\frac{\partial V^n_L}{\partial l(n)}. \end{aligned}$$

The left-hand side is increasing in \(l^n\), while the right-hand side is decreasing in \(\frac{\partial V^n_L}{\partial l(n)}\). So the larger is the derivative of the variance, the smaller is \(l^n\). Hence, \(l(\text {open})<l(\text {closed})\) if and only if \(\frac{\partial V^{\text {open}}_L}{\partial l(\text {open})}>\frac{\partial V^{\text {closed}}_L}{\partial l(\text {closed})}\). This condition holds trivially when the population-wide median is the open primaries median because \(\frac{\partial V^{\text {open}}_L}{\partial l(\text {open})}=0\). \(\square \)
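The comparative static in the last step can be illustrated numerically. The sketch below is only an illustration under strong simplifications: it treats \(\frac{\partial V^n_L}{\partial l(n)}\) as an exogenous constant (`dV_dl`, a hypothetical name), whereas in the model it is determined in equilibrium, and all parameter values are made up. It solves the rearranged equilibrium condition for \(l(n)\) by bisection on \([z_L,0]\), where the left-hand side is increasing when \(z_L<0\).

```python
def lhs(l, z_left, alpha):
    """Left-hand side of the symmetric condition: 4*z_L*l^2/(2*alpha) + l."""
    return 4 * z_left * l ** 2 / (2 * alpha) + l


def solve_platform(z_left, alpha, dV_dl, tol=1e-12):
    """Bisection for l in [z_L, 0] solving lhs(l) = z_L - dV_dl.

    Assumes z_L < 0 (so lhs is increasing on [z_L, 0]) and that the
    target value lies between lhs(z_L) and lhs(0).
    """
    target = z_left - dV_dl
    lo, hi = z_left, 0.0
    if not lhs(lo, z_left, alpha) <= target <= lhs(hi, z_left, alpha):
        raise ValueError("no solution in [z_L, 0] for these values")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if lhs(mid, z_left, alpha) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Hypothetical values: leader at z_L = -1, alpha = 2.
z_left, alpha = -1.0, 2.0
l_open = solve_platform(z_left, alpha, dV_dl=0.0)     # derivative equal to 0
l_closed = solve_platform(z_left, alpha, dV_dl=-0.2)  # a smaller (negative) derivative
print(f"l(open)   = {l_open:.4f}")
print(f"l(closed) = {l_closed:.4f}")
```

In this example the larger derivative yields the smaller (more extreme) left platform, in line with the ordering stated in the proof.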

The proof above establishes the condition on the variance as a general statement. Below we show that this condition always holds under the assumption on \(B(\delta _i,c_i)\).

Proof (Proof of Example in Proposition 3)

In order to prove that the condition in Proposition 3 always holds for \(B(\delta _i,c_i)=c_i^2-(x_i-l(n))^2\) and \(\alpha \ge 0.75\omega ^2\), we need \(\frac{\partial ^{2}V}{\partial l\partial b}<0\). We begin by totally differentiating Eq. 19:

$$\begin{aligned} \mathrm{d}l\left( \frac{\Pi _z}{\alpha }+1+\frac{\partial ^{2}V}{\partial l(n)^{2}}\right) +\mathrm{d}\omega \frac{\partial ^{2}V}{\partial l(n)\partial b}=0. \end{aligned}$$

Solving for \(\frac{\partial V}{\partial l(n)}\) we obtain

$$\begin{aligned} \frac{\partial V}{\partial l(n)}=\frac{3}{2}\omega ^2\alpha \frac{bl(n)\Pi _{d(n)}}{(\Pi _{d(n)}+\alpha )^3}. \end{aligned}$$

Thus, \(\frac{\partial ^{2}V}{\partial l(n)\partial b}=6\alpha \omega ^2b l(n)^3\frac{2\alpha -\Pi _{d(n)}}{(\Pi _{d(n)}+\alpha )^4}\). And \(\frac{\partial ^{2}V}{\partial l(n)^{2}}\ge 0\). Then it follows that

$$\begin{aligned} \text {sign}\left( \frac{\partial l(n)}{\partial b}\right) =-\text {sign}\left( \frac{\partial ^{2}V}{\partial l(n)\partial b}\right) =\text {sign}(2\alpha -\Pi _{d(n)}). \end{aligned}$$

Thus, for all \(\alpha \ge \Pi _{d(n)}/2\) or \(1\ge \frac{4l(n)^2}{2\alpha }\) the condition holds. From Equation 19 we obtain \(z_L\frac{4l(n)^2}{2\alpha }+l(n)=z_L-\frac{\partial V}{\partial l(n)}\). Thus, if \(\frac{z_L-l(n)-\frac{\partial V}{\partial l(n)}}{z_L}\le 1\), the condition holds. That is for \(z_L-l(n)-\frac{\partial V}{\partial l(n)}\ge z_L\), which always holds because \(l<0\) and \(\frac{\partial V}{\partial l(n)}<0\). \(\square \)

Proof (Proof of Example in Corollary 3)

Due to symmetry, wlog we focus on party L, and hence, all party subscripts are dropped. Also, let \(B(\delta _i,c_i)=c_i^2-(x_i-l(n))^2\) and \(\alpha \ge 0.75\omega ^2\). Solving for optimal charisma and platforms, we obtain \(V(c(l(n)))=\frac{3\omega ^2}{4}(\frac{\Pi _{d(n)}}{\Pi _{d(n)}+\alpha })^2\), and so,

$$\begin{aligned} \frac{\partial V}{\partial l(n)}=\frac{3}{2}\omega ^2\alpha \frac{bl(n)\Pi _{d(n)}}{(\Pi _{d(n)}+\alpha )^3}. \end{aligned}$$

Here we want to prove that \(\exists ! {\bar{\omega }}\ge 0\) such that \(\forall \omega >{\bar{\omega }}\ge 0\), \(\Omega ^o<\Omega ^c(\omega )\). The following two intermediate results are useful to facilitate the proof of the statement in Example 3. \(\square \)

Note that \(\frac{\partial l}{\partial \omega }\ge 0\) for all \(\omega \ge 0\) follows from total differentiation of \(l^c\) in Eq. 19. That is, \(\frac{\mathrm{d}l^c}{\mathrm{d}\omega }(\Pi _z/\alpha +1+\frac{\partial ^{2}V}{\partial (l^c)^{2}})=-\frac{\partial ^{2}V}{\partial \omega \partial (l^c)}\), where the LHS and the RHS are positive because \(\frac{\partial ^{2}V}{\partial (l^c)^{2}}\ge 0\) and \(\frac{\partial ^{2}V}{\partial \omega \partial (l^c)}\le 0\), respectively.

Lemma 2

\(\displaystyle \lim _{\omega \rightarrow \infty } V(c^n_P)=0\) and \(\displaystyle \lim _{\omega \rightarrow \infty }l(n)^2=0\).


We plug \(\frac{\partial V}{\partial l(n)}\) in Eq. 19 and we divide both sides by \(\omega ^2\). Thus,

$$\begin{aligned} -\alpha \left[ \frac{3}{2}\frac{l(2l)^2}{(4l^2+\alpha )^3}\right] =\frac{l}{\omega ^2}-\frac{z}{\omega ^2}+\frac{4z}{2\alpha }\frac{l^2}{\omega ^2}. \end{aligned}$$

Taking into account \(\frac{\partial l}{\partial \omega }\ge 0\), when \(\omega \rightarrow \infty \), the RHS goes to 0, and so \(l_{\omega \rightarrow \infty }=0\). Moreover, the LHS goes to 0 at the rate \(\frac{1}{\omega ^2}\). Then, the variance can be rewritten as \(\omega ^2[\frac{3}{2}\frac{l(2l)^2}{(4l^2+\alpha )^3}] \times l(4l^2+\alpha )\), where \(\omega ^2\) and the term in brackets increase and decrease at the same speed with \(\omega \) (i.e., they cancel each other) and \(\displaystyle \lim _{\omega \rightarrow \infty } l(4l^2+\alpha )=0\). \(\square \)

Thus, note that at \(\omega =0\) it must be that \(\Omega ^o=\Omega ^c\). This parity of welfare follows from \(l^c=l^o\) and \(V(\omega =0)=0\). Moreover, from Lemma 1 the variance is bounded by \(\frac{3}{4}\omega ^2\le \alpha \), which implies that it is always lower than \(l(\text {open})^2\) for any \(\alpha \). With \(l(n)^2\) decreasing in \(\omega \) toward 0, then \(V(n)+l(n)^2\) must cross \(l(\text {open})^2\) at some point \({\bar{\omega }}\) such that \(\forall \omega >{\bar{\omega }}\) it holds that \(V(n)+l(n)^2<l(\text {open})^2\), and hence, \(\Omega ^o<\Omega ^n(\omega )\). \(\square \)
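The bound \(\frac{3}{4}\omega ^2\) on the variance can be read off the closed form \(V=\frac{3\omega ^2}{4}\left( \frac{\Pi _{d(n)}}{\Pi _{d(n)}+\alpha }\right) ^2\) stated in this proof. A minimal numerical sketch, with hypothetical values for \(\omega \), \(\alpha \) and the policy gain (named `gain` here):

```python
def variance(gain, omega, alpha):
    """Closed-form variance V = (3/4) * omega^2 * (Pi / (Pi + alpha))^2,
    where gain stands for Pi_{d(n)} >= 0 and alpha > 0."""
    return 0.75 * omega ** 2 * (gain / (gain + alpha)) ** 2


# Hypothetical values; alpha is set to the boundary case alpha = 0.75 * omega^2.
omega = 1.0
alpha = 0.75 * omega ** 2
gains = [0.0, 0.5, 1.0, 10.0, 1000.0]
values = [variance(g, omega, alpha) for g in gains]

for g, v in zip(gains, values):
    print(f"Pi={g:8.1f}  V={v:.6f}  (bound {0.75 * omega ** 2:.2f})")
```

Because \(\frac{\Pi _{d(n)}}{\Pi _{d(n)}+\alpha }<1\) for any \(\alpha >0\), the variance rises with the policy gain but stays strictly below \(\frac{3}{4}\omega ^2\), and therefore below \(\alpha \) under the maintained condition \(\alpha \ge 0.75\omega ^2\).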

7.2 Robustness

In this section, I show that the main results of the paper hold even under alternative information structures.

Throughout the paper, I assumed that charisma is observed and ideologies are not. In order to address the potential observability of ideologies, here I assume the opposite. That is, I assume that ideology is observed and charisma is not (as in Andreottola 2016). Hence, primary voters nominate candidates according to the pre-candidates’ ideologies. Similarly, general voters vote according to the candidates’ ideology, partisan affiliation and expected charisma.

Let ideologies \(x_{i}\) be observed and charisma \(c_{i}\) be private knowledge, with \(x_i\perp c_i\). Utility is defined as in Eq. 1, and the timing of the game is as before. That is, at \(t=1\) citizens observe platforms and make affiliation decisions according to \(B(c_i,\delta _i)\). At \(t=2\), the primary’s median voter nominates an affiliated member with ideology \(x_P\) and expected charisma \(E(c_P|x_P)\). At \(t=3\), voters observe \((x_L,E(c_L|x_L);x_R,E(c_R|x_R))\), the general election takes place and the winner implements his platform.

For simplicity, let \(B(c_i,\delta _i)=c_i^2+t-(x_i-\pi _p)^2\), with \(t\ge 0\). This function determines the incentives to affiliate to a party P. Citizens far from \(\pi _p\) would only affiliate if they are very charismatic, while citizens relatively close to the platform would affiliate independently of charisma. In sum, a candidate who fits the party’s platform (the relatively extreme one) is expected to have low charisma, which implies a smaller probability of winning (see footnote 26).
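The affiliation rule above implies a minimum charisma at each ideological distance from the platform. The sketch below illustrates this under one added assumption that the text does not state at this point: charisma is drawn uniformly on \([0,{\bar{c}}]\), independently of ideology. The function names and the parameter `c_max` (standing in for \({\bar{c}}\)) are hypothetical.

```python
import math


def min_charisma(x, platform, t):
    """Smallest charisma at which a citizen with ideology x affiliates,
    from c^2 + t >= (x - platform)^2."""
    return math.sqrt(max(0.0, (x - platform) ** 2 - t))


def expected_charisma(x, platform, t, c_max=1.0):
    """E[c | affiliated, x] when c is uniform on [0, c_max].
    The uniform distribution is an assumption made for this sketch."""
    m = min_charisma(x, platform, t)
    if m > c_max:
        raise ValueError("no citizen at this ideology affiliates")
    return (m + c_max) / 2


# Hypothetical left platform at -0.5 and a small affiliation benefit t.
platform, t = -0.5, 0.01
for x in (-0.5, -0.3, -0.1, 0.1):
    print(f"x={x:+.1f}  threshold={min_charisma(x, platform, t):.4f}  "
          f"E[c|x]={expected_charisma(x, platform, t):.4f}")
```

In this example a member located exactly at the platform has the unconditional mean charisma, while members further away must clear a higher threshold and so have higher expected charisma, which is the signal that ideology carries in this setup.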

Voters in the general election care about the candidates’ proposals, but, as before, they are more likely to vote for charismatic ones. Thus, in the last stage of the game, charisma and moderation increase the probability of winning the election. The decisive voter in the primary then faces a trade-off between his ideal candidate (extreme in closed primaries) and charisma. The more extreme a candidate is (that is, the closer to the party’s platform), the less charismatic he is expected to be.

This ex-post relationship between ideology and charisma is similar to the main model. When voters observe the ideology and the affiliation of the candidates (i.e., \(x_P\)), they update their beliefs on the conditional density \(c_{i}|x_P\). As a result, more moderate decisive voters (closer to 0) care less about a particular party winning, and so they nominate less charismatic candidates than more extreme decisive voters (closer to the party’s platform) (see footnote 27).

Intuitively, the more extreme the decisive voter is, the more he cares about winning, and so he is more likely to choose a more charismatic candidate, at the cost of choosing a more moderate candidate (which gives him less utility if he wins). This intuition holds when charisma has a large effect on the probability of winning, relative to the affiliation.

Lemma 3

For \(\alpha \) large enough, there is a large enough \(\omega \), such that the candidates’ charisma is decreasing in nominator’s ideology. That is, more open primaries lead to nominating less charismatic and more extreme candidates.

For large \(\omega \), the value of signaling a large charisma increases with it, relative to choosing a more moderate policy. Therefore, more extreme nominators choose candidates whose preferred policy is closer to them, and have higher charisma.

The proof of Lemma 3 includes a few steps. First, I show that ideology signals charisma. Second, a primary voter could choose a more charismatic candidate by choosing a candidate far from the party’s platform, to the right or the left. In this setup, they choose the more moderate one because the probability of winning increases. Third, I show under which conditions more extreme nominators are willing to choose more moderate candidates, who are in turn also more charismatic. In this new setup, the endogenous cost of nominating a more charismatic candidate is choosing a candidate who is further away from the nominator’s ideal point.

To show that the necessary monotonicity also holds if the ideologies are observed instead of charisma, it is enough to show that: \(\frac{\partial x_{L}}{\partial d_{L,n}}<0\). First, I show how the affiliation decisions change as I change the previous assumption; second, I show that the result still holds.

If a voter \((x_{i},c_{i})\) affiliates to party L, then \(B(\delta _i,c_i)>0\), i.e., \(b(c_{i})=c^{2}_{i}+t\ge (x_{i}-l(n))^{2}\). Let \(x_{L}\) be the observed candidates’ ideology; and let \(|x_{L}|<|l|\), and the same for R. Then

$$\begin{aligned}E(c_{i}|x_{L})=\int _{c_{i}\ge b^{-1}\left( (x_{L}-l)^{2}\right) }c_{i}\,\mathrm{d}F(c_{i}).\end{aligned}$$

Let \(d_{L,n}\) be the nominator’s ideology from party L; in a symmetric game he maximizes

$$\begin{aligned} \frac{\alpha +x_{R}^{2}-x_{L}^{2}+\omega \left[ E(c_{R}|x_{R})-E(c_{L}|{x_{L}})\right] }{2\alpha }\left( 4x_{L}d_{L,n}\right) \end{aligned}$$

Lemma 3 states that for a large enough \(\alpha \) there exists a large \(\omega \) such that the equilibrium \(x_{L}\) is decreasing in \(d_{L,n}\). Thus, the main result of monotonicity holds: more moderate nominators choose lower charisma candidates. Using the implicit function theorem, and for a large enough \(\alpha \), the lemma holds. Let

$$\begin{aligned} E'\equiv \frac{\partial E(c_{L}|{x_{L}})}{\partial x_{L}}=\frac{(x_{L}-d_{L,n})}{2\sqrt{(x_{L}-d_{L,n})^{2}-t}}>0. \end{aligned}$$


For \(x_{L}>l\), in the FOC the ideology of the candidate is implicitly defined by

$$\begin{aligned} \left[ (x_{R}-d_{L,n})^{2}-(x_{L}-d_{L,n})^{2}\right] \left( \frac{\omega }{2\alpha }E'-\frac{x_{L}}{\alpha }\right) -2(x_{L}-d_{L,n})P_{L}=0. \end{aligned}$$

The SOC always holds: \(\frac{\partial {\text {FOC}}}{\partial x_{L}}=-\frac{t}{2((x_{L}-d_{L,n})^{2}-t)^{3/2}}<0.\) Then, by the implicit function theorem, \(\frac{\partial x_{L}}{\partial d_{L,n}}<0\) if

$$\begin{aligned} \frac{2x_{L}}{\alpha }\left( \frac{\omega }{2}\frac{(x_{L}-d_{L,n})}{((x_{L}-d_{L,n})^{2}-t)^{1/2}}-2x_{L}\right) +1<0, \end{aligned}$$

and since \(0>x_{L}>l\), the inequality above holds for a large enough \(\omega \), given that \(\alpha \) is not too small; that is, for \(\alpha >4x_{L}^{2}\). \(\square \)


Cite this article

Casas, A. Ideological extremism and primaries. Econ Theory 69, 829–860 (2020).




  • Elections
  • Primaries
  • Polarization
  • Valence
  • Charisma
  • Open primaries
  • Closed primaries

JEL Classification

  • D02
  • D7
  • D72