1 Introduction

The last decades have seen a mounting distrust in political elites, increasingly perceived by citizens as distant and self-interested decision makers (Enikolopov and Zhuravskaya 2007; Treisman 2007; Fan et al. 2009). Reducing the institutional distance between rulers and citizens (i.e., the degree of centralization in the political decision-making process) is often considered a possible way to regain citizens’ confidence in politics. Federal constitutional systems and decentralized settings with greater expenditure and fiscal autonomy (Weingast 2009), or electoral rules favoring direct elections of public officials and small electoral districts (Persson and Tabellini 2003; Micozzi 2013), increase the proximity of citizens to elected politicians. This would give citizens more control over policy actions and push politicians toward more transparent and responsive behavior, thus leading to less corruption (Arikan 2004; Ferraz and Finan 2011) and larger provision of public goods (Seabright 1996; Fisman and Gatti 2002; Hindriks and Lockwood 2009; Weingast 2009; De Janvry et al. 2012; Smart and Sturm 2013; Gradstein 2017).

The empirical evidence, however, does not fully support this view. Several studies found that the effect of decentralization reforms on public good provision is positive or negative, depending on the economic and institutional context (Rodden 2006; Goel et al. 2017; Bordignon et al. 2020; Gong et al. 2020).

To illustrate the ambiguous relation between public good and services provision and voter-politician institutional proximity, in Fig. 1 we consider the association between the number of hospital beds over population, a public good typically provided at the local level, and the number of sub-national government layers as a proxy for decentralization (Brennan and Buchanan 1980; Nelson 1986; Goel and Nelson 2011; Ivanyna and Shah 2014). To obtain Fig. 1, we regress per capita hospital beds on real GDP and country and year fixed effects in a sample of 31 OECD and non-OECD countries over the years 1980–2012. The regression residuals are then plotted against the number of sub-national government layers for low-corruption (panel A) and high-corruption (panel B) countries (where corruption is a measure of institutional quality). Figure 1 depicts a positive relationship between institutional proximity and public goods provision in low-corruption countries (panel A) and a negative relation between proximity and public goods in high-corruption countries (panel B), suggesting that different institutional conditions may alter the relevance and direction of the effects of citizen-politician distance.

Fig. 1

Per capita hospital beds and number of sub-national government layers. Note: In the two graphs residuals from the pooled OLS regression, \(Hospb_{it}=\gamma _0+\gamma _1GDP_{it}+\gamma _2 Country_{i}+\gamma _3 {Time_{t}}+\epsilon _{it}\), are plotted against CLOSE, where Hospb is the number of hospital beds (per 1000 people), GDP is real GDP in thousands US$ at constant 2005 national prices, Country and Time are country and time fixed effects, and CLOSE is the number of sub-national governments over resident population (per 100,000 people). Panel A considers the countries with values of the political corruption index below the sample median (low-corruption countries) and Panel B the countries with political corruption above the median level (high-corruption countries), where political corruption is measured by the index of public sector corruption constructed by Coppedge et al. (2015). Data cover the period 1980–2012. The list of countries includes: Albania, Australia, Austria, Belgium, Bulgaria, Canada, Croatia, Czech Republic, Denmark, Finland, France, Germany, Greece, Hungary, Iceland, Ireland, Italy, Japan, Lithuania, Netherlands, New Zealand, Norway, Poland, Portugal, Romania, Spain, Sweden, Switzerland, Turkey, United Kingdom and United States
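The residual-based construction described in the note can be sketched in a few lines. The snippet below is only an illustration: the function names `fe_residuals` and `low_corruption_mask` and the array-based inputs are hypothetical, and no data from the paper are reproduced. It regresses the outcome on GDP plus country and year dummies by least squares and returns the residuals, which would then be plotted against CLOSE separately for the two groups defined by the median of the corruption index.

```python
import numpy as np

def fe_residuals(y, gdp, country, year):
    """Residuals from OLS of y on gdp plus country and year dummies."""
    y = np.asarray(y, dtype=float)
    gdp = np.asarray(gdp, dtype=float)
    country = np.asarray(country)
    year = np.asarray(year)
    # One dummy column per distinct country and per distinct year.
    c_dummies = (country[:, None] == np.unique(country)[None, :]).astype(float)
    t_dummies = (year[:, None] == np.unique(year)[None, :]).astype(float)
    # Drop the first year dummy: the country dummies already span the intercept.
    X = np.column_stack([gdp, c_dummies, t_dummies[:, 1:]])
    # lstsq returns a least-squares solution even if X is rank deficient.
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ coef

def low_corruption_mask(corruption):
    """True where the corruption index is below its sample median."""
    corruption = np.asarray(corruption, dtype=float)
    return corruption < np.median(corruption)
```

With real data, `fe_residuals(...)[low_corruption_mask(corr)]` would give the values for one panel and the complement of the mask the values for the other.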

To account for the ambiguous effect of institutional distance, we consider a political agency model augmented by the possibility that incumbent politicians take a costly action to distort public opinion.Footnote 1 This action is aimed at altering information on the true cost of the public good, so that politicians can get a rent and still be re-elected. The optimal biasing action varies with institutional distance, citizens’ characteristics and other context factors, which affect its effectiveness. For instance, information-bias actions tend to be most effective in countries or regions where local media competence, integrity and independence from the political power are poor (Di Tella and Franceschelli 2011; Snyder and Strömberg 2010; Kendall et al. 2015; Dutta and Roy 2016; Ferguson et al. 2019) or where media competition for local news is lower than that for national news (Besley and Prat 2006).Footnote 2 In equilibrium, the provision of public good increases with the costs of biasing information and voters’ ability to correctly process information on the cost of public good. Instead, institutional distance has an ambiguous effect on public good provision, depending on the relationship between proximity and the effectiveness of the biasing action. For example, if local media are more effective watchdogs against incumbent politicians than national media, the distorting action becomes less effective as distance decreases, and the provision of public goods is enhanced by decentralization reforms or direct elections of public officials. By contrast, if local media are easily captured by local politicians, proximity brings about more effective biasing actions and public good provision decreases.Footnote 3

In terms of welfare, our model produces the known trade-off between selection and discipline (Besley and Smart 2007; Ashworth et al. 2017). Increasing proximity between voters and politicians might lead to a reduction in social welfare, as rent-seeking politicians are more easily spotted and may therefore prefer to grab the maximum possible rent today instead of trying to be re-elected.Footnote 4 In this setting, a reduced distance between citizens and politicians, even when implying greater accountability, does not always improve public goods provision and welfare. Indeed, since rent-seeking incumbent politicians are compelled to “behave” and reduce rent extraction in order to be re-elected (discipline effect), running for re-election may become a very costly strategy, thus pushing the incumbents to grab the maximum possible rent while in office and forfeit any re-election aspiration (selection effect).

Our main contribution is to characterize this trade-off, showing that the distance threshold for the discipline effect to prevail changes with the context parameters. In particular, the welfare-maximizing voter-politician distance depends on voters’ ability, the cost and effectiveness of information biasing, and the share of rent-seeking politicians in the jurisdiction.Footnote 5 It follows that the optimal distance generally differs across countries and regions. Therefore, decentralization reforms setting a uniform lower voter-politician distance for all jurisdictions may be welfare improving in some regions and harmful in others.

Finally, we consider a mechanism that could counteract possible negative effects of proximity on welfare. We extend the previous model by explicitly considering the monetary compensation that politicians receive while in office, and allowing for the pool of politicians to be endogenously determined by the participation of citizens in political activity. The salary paid to politicians increases the value of being in office for a second term, whence a salary threshold is shown to exist above which rent-seeking politicians, once in office, prefer a pooling strategy to mimic the benevolent ones rather than a separating strategy to extract the maximum rent. This salary threshold is higher, the lower the distance between citizens and politicians. In addition, both salary and distance contribute to determining the pool of politicians. Thus, the remuneration of political officeholders can be used to mitigate the possible welfare-decreasing effects of decentralization reforms in weak regions with low quality of local media and a high share of bad politicians. In particular, paying a relatively high salary to politicians in weak regions and a low salary in strong regions may be welfare-maximizing. In this way, in weak regions a pooling equilibrium would dominate, and bad politicians would find it profitable to provide enough public good to disguise themselves as benevolent. By contrast, in strong regions bad politicians, once in office, would be filtered out at the next election and most likely replaced by good politicians.

Our paper is closely related to the literature exploring the impact of information manipulation on public sector efficiency (Bordignon and Minelli 2001; Besley and Prat 2006), selection of politicians (Poutvaara and Takalo 2007; Bordignon et al. 2020), and the dilemma between selection and discipline (Besley and Smart 2007; Ashworth et al. 2017). In addition, we also relate to the literature on the effects of political decentralization reforms and electoral accountability on public services provision (Seabright 1996; Hindriks and Lockwood 2009; Boffa et al. 2016; Aslim and Neyapti 2017).

The paper is organized as follows. Section 2 presents our model. Section 3 describes the equilibrium, while in Sect. 4 the welfare analysis is developed. Section 5 discusses the results applied to the case of heterogeneous regions and the related policy implications. In Sect. 6, the analysis is extended to paid politicians. Section 7 concludes.

2 The Model

Consider a two-period economy, \(t = 1, 2\), where an election takes place at the end of period 1. The electoral competition is between the incumbent government and a challenger. In order to be elected, candidates must get the majority of votes. In each period, the government in office collects tax revenues \(T\ge 0\), which are exogenously determined and identical in the two periods, and provides an amount of public good \(G_t\ge 0\) at the unit cost \(\theta _t > 0\), which is drawn by Nature, independently in each period, from the same probability density function (pdf) \(f_\Theta (\theta )\). The institutional setting rules out public deficits, so that \(\theta _tG_t \le T\) must hold in both periods.

There is a continuum of citizens of measure 1. In each period, citizens gain utility from the public good, net of the taxes they pay. Since the level of taxation is exogenously given, and the value of \(\theta \) is determined by Nature, citizens’ welfare can be written in terms of the amount of public good available in the two periods, \(SW=G_1+\delta G_2\), where \(\delta \in \left[ 0,1\right] \) is the discount factor. Therefore, citizens’ welfare in period t is maximized when the whole amount of tax revenue is used for the provision of public good:

$$\begin{aligned} G_t= \frac{T}{\theta _t}=G^{*}_{t} \end{aligned}$$
(1)

The amount of the public good provided by politicians is publicly observable, while the realization of the cost variable \(\theta _t\) in each period is private information of the incumbent government. Voters know the pdf \(f_\Theta \) and receive a noisy signal about the realization of \(\theta \), independent of the amount of public good \(G_t\). We assume that the signal \(s_i\) received by voter i depends on the realization of the unit cost \(\theta _t\), the voter’s individual attitude towards politicians \(\alpha _i\), and a possible aggregate information signal \(z\left( x,d\right) \) purposely generated by incumbent politicians to modify voters’ perceptions of how much the amount of public good supplied reflects production costs.Footnote 6 Specifically:

$$\begin{aligned} s_i= \theta _t\alpha _iz\left( x,d\right) . \end{aligned}$$
(2)

The term \(\alpha _i\ge 0\) captures the heterogeneity of voters’ political awareness and ability to interpret public information about government policies; it is assumed to be randomly distributed across voters with a pdf \(h_A(\alpha )\). The value of \(\alpha _i\) reflects the skeptical (\(\alpha _i < 1\)) or credulous (\(\alpha _i >1 \)) attitude of voters with respect to politicians and political life. Skeptical voters tend to systematically underestimate the true unit cost of public good, while credulous voters are inclined to overestimate the value of \(\theta \).

The aggregate bias \(z(x,d)\ge 1\) is affected by a factor \(d \in \left( 0,D\right) \) representing the institutional distance between voters and elected politicians and by the intensity \(x \ge 0\) of possible unobservable actions that incumbent politicians carry out to bias information available to citizens through media, with \(z_x(x, d)>0\), \(z_{xx}(x,d) \le 0\), \(z(0,d)=1\) for any d, and \(z_d (x,d) >0\) for any \(x >0\).Footnote 7 The unit cost for the incumbent politicians of taking biasing actions is \(c>0\).

The distance parameter d captures the degree of centralization in the decision-making process or electoral rules.Footnote 8 More centralization (i.e., larger d) makes it harder for citizens to acquire reliable information about the true value of \(\theta \) and easier for the government to cheat (i.e., \(s_i\) increases for any given value of \(x>0\)). The action of distorting information by the incumbent government aims at affecting voters’ beliefs by inducing them to overestimate the true cost of public good, so as to have the chance to get a rent and still be re-elected. As shown in Sect. 3, the optimal biasing action \(x^{*}\) depends on distance and other parameters, so that a change in d yields possibly ambiguous effects on \(s_i\). Finally, for simplicity, we assume that, whatever d is, if incumbent politicians do not take any action to bias public information on costs of public goods provision (i.e., if \(x=0\)), the signal that voters get from media depends only on their skeptical or credulous attitude towards politicians.
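As a concrete illustration of Eq. (2), the snippet below evaluates the signal under one functional form for the aggregate bias. The form \(z(x,d)=1+d\ln (1+x)\) is purely a hypothetical choice made here for illustration, not taken from the paper; it satisfies the stated properties (\(z(0,d)=1\), \(z_x>0\), \(z_{xx}<0\), and \(z_d>0\) for \(x>0\)). The function names are likewise assumptions.

```python
import math

def z(x, d):
    """Hypothetical aggregate bias z(x, d) = 1 + d*ln(1 + x), an assumed form."""
    return 1.0 + d * math.log1p(x)

def signal(theta, alpha_i, x, d):
    """Eq. (2): signal received by voter i, s_i = theta * alpha_i * z(x, d)."""
    return theta * alpha_i * z(x, d)
```

With no biasing action (`x = 0`), the signal collapses to `theta * alpha_i` whatever the distance, in line with the simplifying assumption stated above.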

Voters know that the signal may be noisy. However, they do not know the pdf \(h_A(\alpha )\) and view themselves as aware citizens, neither systematically skeptical nor gullible, able to correctly interpret the information provided by politicians and not be influenced by possible biasing actions. Therefore, the belief of voter i about the cost of public good is exactly equal to the received signal \(s_i\). Unlike citizens, politicians know how \(\alpha \) is distributed in the population of voters, and in particular they know its median value \(\alpha _m\). The idea is that politicians are able to form a fairly accurate opinion of the distribution of political awareness in the electorate, due to their continued political activity, recurrent opinion polls on political preferences, and participation in electoral campaigns.

Politicians are either benevolent or rent-seeking, \(\pi \in \{b,r\}\), in proportions \(\beta \) and \(1-\beta \), respectively. Benevolent politicians maximize citizens’ welfare when they are in office. By contrast, rent seekers maximize the expected discounted amount of tax revenues that they can divert in the two periods:

$$\begin{aligned} U^b&= G^{b}_1+ \delta G^{b}_2 \end{aligned}$$
(3)
$$\begin{aligned} U^r&= T-\theta _1G^{r}_1-cx^{r}_1 + \delta (T-\theta _2G^{r}_2-cx^{r}_2). \end{aligned}$$
(4)

Citizens know the value of \(\beta \), but cannot observe the type of incumbent politicians and challengers. The timeline of the political game is the following.

  • \(t=1\): the value of \(\theta _1\) is observed by the incumbent politician; she decides the intensity of biasing action x, and the level of public good \(G_1\); payoffs are realized;

  • \(t=2\): each voter i observes \(G_1\) and a signal \(s_i\) about the unit cost of production \(\theta _1\), forms an expectation about the incumbent politician’s type, and decides whether to re-elect the incumbent or vote for the challenger; the elected politician observes \(\theta _2\) and decides the amount of public good \(G_2\); payoffs are realized.

The strategy of an incumbent politician of type \(\pi \) specifies the public good provided and the biasing action carried out in each period, \(\sigma ^{\pi }=\left( G^{\pi }_1,x^{\pi }_1,G^{\pi }_2,x^{\pi }_2\right) \) for \(\pi \in \{b,r\}\). The set of strategies of the voters consists of the possible voting rules \(v_i\) establishing whether to vote for the incumbent \(v^I\) or the challenger \(v^C\), according to the observed amount of public good and the perceived signal \(s_i\) about the cost \(\theta _1\).

3 Equilibrium

We characterize the equilibrium as the set of strategies of the benevolent and rent-seeking incumbent politicians in periods 1 and 2, which are best responses to, and consistent with, voters’ beliefs about the cost of the public good and the type of the incumbent politician. Proceeding by backward induction, we first consider strategies and payoffs in the last office term.

In period 2, the dominant strategy of benevolent politicians is to provide the welfare-maximizing amount of public good, \(G^b_{2}=G^*_{2}=T/\theta _2\), and to take no action to bias the information available to voters, \(x^b_{2}=0\). Rent-seeking politicians in office in period 2 also have a unique dominant strategy: they do not spend resources to bias information available to voters (i.e., \(x^r_{2}=0\)), and pocket all the tax revenues without providing any public good (i.e., \(G^r_{2}=0\)).Footnote 9 Therefore, the amount of public good in period 2 is independent of d, and it is equal to the first best or zero according to whether a benevolent or a rent-seeking politician is in office.

At the beginning of period 2, each voter observes the actual \(G_1\) provided by the incumbent politician and the signal \(s_i\) about the production cost \(\theta _1\). Given this information set, voters express their electoral preferences and decide whether to vote for the incumbent or the challenger. Voting is not strategic, so that each citizen decides on the basis of her own payoffs without taking into account the voting patterns of other citizens. In addition, voting is purely retrospective and the challenger is drawn randomly from the pool of politicians at the date of election, such that screening challengers is impossible for voters.Footnote 10 For the sake of simplicity, we assume that voters adopt the following behavioral voting strategy.

Assumption 1

Footnote 11

$$\begin{aligned} v_i={\left\{ \begin{array}{ll} v^I \quad \text {(vote for the incumbent)} \quad \text {if} \quad G_1\ge {\hat{G}}_i= T/s_i \\ v^C \quad \text {(vote for the challenger)} \quad \text {if} \quad G_1 < {\hat{G}}_i= T/s_i.\\ \end{array}\right. } \end{aligned}$$

From Assumption 1, it follows that under the majority rule, the incumbent politician is re-elected with probability \(q=1\) or \(q=0\), according to whether or not \(G_1\) is greater than or equal to \({\hat{G}}_i\) for at least half of the population. Re-election therefore requires:

$$\begin{aligned} G_1 \ge {\hat{G}}_m =\frac{T}{s_m}=\frac{T}{\theta _1\alpha _mz\left( x,d\right) } \end{aligned}$$
(5)

where the subscript m identifies the median voter.
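The link between Assumption 1 and condition (5) can be checked on a small example. The sketch below is illustrative only: it replaces the continuum of voters with a finite, odd-sized list of attitudes (an assumption made here so that the median voter is well defined), and the function names are hypothetical. It counts individual votes under Assumption 1 and compares the outcome with the median-voter threshold of Eq. (5).

```python
def reelection_threshold(T, theta, alpha_m, z_value):
    """Eq. (5): minimum G_1 the median voter requires, T / (theta * alpha_m * z)."""
    return T / (theta * alpha_m * z_value)

def incumbent_reelected(G1, T, theta, alphas, z_value):
    """Assumption 1 plus majority rule for a finite electorate.

    Voter i supports the incumbent iff G1 >= T / s_i, with
    s_i = theta * alpha_i * z_value as in Eq. (2).
    """
    votes = sum(1 for a in alphas if G1 >= T / (theta * a * z_value))
    # Strict majority; with an odd number of voters this is the median-voter test.
    return 2 * votes > len(alphas)
```

For instance, with attitudes `[0.8, 1.0, 1.5]`, `T = theta = z_value = 1`, the median voter requires `G1 >= 1.0`, and the vote count agrees with that threshold.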

Condition (5) provides some interesting insights on the role of voters’ heterogeneity. In general, \(s_m\) may be greater than, equal to, or lower than the actual unit cost of public good \(\theta _1\), depending on whether the median bias \(\alpha _mz\left( x,d\right) \) is greater than, equal to, or less than 1. Two broad scenarios may occur.

If the median voter is skeptical and \(\alpha _m <1\), the amount of public good needed for benevolent politicians to be re-elected is greater than the socially optimal amount: \({\hat{G}}_m>T/\theta _1\). As a result, given the impossibility of running a deficit, benevolent incumbents cannot be re-elected, unless they provide information to offset voters’ excess skepticism.Footnote 12 Only rent-seekers can be re-elected if x and d are such that \(\alpha _mz\left( x,d\right) >1\) and information biasing actions make it possible to extract some rents. Therefore, when \(\alpha _m <1\), election acts as an adverse selection mechanism and can only be a discipline device for incumbent rent-seekers.

When \(\alpha _m \ge 1\), the median voter is non-skeptical and does not underestimate the production costs of public goods. In this case, the amount of public good needed to be re-elected does not exceed the maximum feasible amount, leaving the possibility for both benevolent and rent-seeking politicians to be re-elected. Election is then a twofold device to select and discipline politicians. As this is the most interesting case, and nothing fundamental changes in the welfare analysis and policy discussion, in the rest of the paper we will focus on it. However, in Appendix B, we characterize equilibrium and social welfare for the case of skeptical median voters.

Assumption 2

\(\alpha _m \ge 1.\)

Under Assumption 2, a benevolent incumbent maximizes her utility (3) by providing the socially optimal level of public good \(G^{b}_1=T/\theta _1\) and spending no resources on influencing information available to voters, \(x^{b}_1=0\), and is always re-elected with \(q=1\).

By contrast, a rent-seeking incumbent has two strictly undominated strategies.

  (i) A “hit and run” strategy (henceforth, H-strategy), consisting in grabbing the maximum rent in period 1, taking no information biasing action \(x^{r,H}_1=0\) and providing \(G^{r,H}_1=0\). Since this strategy reveals that the politician in office is a rent-seeker, she is not re-elected and her payoff is:

    $$\begin{aligned} U^{r,H}=T. \end{aligned}$$
    (6)

  (ii) An “election” strategy (henceforth, E-strategy), consisting in providing the amount of public good needed to be re-elected, and then pocketing all tax revenues in period 2. Thus, the E-strategy implies \(x^{r,E}_1> 0\) and \(G^{r,E}_1={\hat{G}}_m\) in period 1, with a payoff equal to:

    $$\begin{aligned} U^{r,E}=\left( T-\theta _1{\hat{G}}_m-cx^{r,E}_1\right) +\delta T =\left( 1+\delta \right) T-\frac{T}{\alpha _mz\left( x^{r,E}_1,d\right) }-cx^{r,E}_1. \end{aligned}$$
    (7)

Equations (6) and (7) indicate that a rent-seeking incumbent faces a trade-off between getting the whole rent today but giving it up tomorrow, and forgoing some rent today in order to get the full rent in the second period. Since the information bias \(z\left( x,d\right) \) is concave in x (\(z_{xx}\le 0\)), the optimal bias is determined by maximizing the payoff under the E-strategy in Eq. (7). The first-order condition, which equates the marginal benefit and the marginal cost of the biasing activity, implicitly defines the value \(x^{r,E}_1=x^*\) that satisfies:

$$\begin{aligned} \frac{Tz_x\left( x^*,d\right) }{\alpha _m\left[ z\left( x^*,d\right) \right] ^2}=c. \end{aligned}$$
(8)

Since the second-order condition \(\left( T/\alpha _m\right) \left( z_{xx}z-2z^2_{x}\right) /z^3<0\) holds, \(x^*\) is a global maximum for \(U^{r,E}\). Plugging \(x^*\) into (7), the best strategy for rent-seeking politicians is derived by comparing the maximum payoffs under the E- and H-strategies. In particular:

$$\begin{aligned} \text {E-strategy} \succsim \text {H-strategy} \quad \Longleftrightarrow \quad z\left( x^*,d\right) \left( \delta T-cx^*\right) \ge \frac{T}{\alpha _m}. \end{aligned}$$
(9)
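Equations (8) and (9) can be evaluated numerically once a functional form for the bias is fixed. The sketch below is illustrative only: it assumes the hypothetical form \(z(x,d)=1+d\ln (1+x)\), which is not taken from the paper, solves Eq. (8) by bisection, and then checks condition (9). The function names, the parameter values in the usage note, and the corner rule \(x^*=0\) when the marginal benefit at \(x=0\) is below c are all assumptions of this example.

```python
import math

def z(x, d):
    """Assumed bias function z = 1 + d*ln(1 + x); illustrative only."""
    return 1.0 + d * math.log1p(x)

def z_x(x, d):
    """Partial derivative of the assumed z with respect to x."""
    return d / (1.0 + x)

def foc_gap(x, d, T, alpha_m, c):
    """Left-hand side of Eq. (8) minus c (marginal benefit minus marginal cost)."""
    return T * z_x(x, d) / (alpha_m * z(x, d) ** 2) - c

def optimal_bias(d, T, alpha_m, c):
    """Solve Eq. (8) for x* by bisection; the gap is decreasing in x."""
    if foc_gap(0.0, d, T, alpha_m, c) <= 0.0:
        return 0.0  # corner case assumed here: biasing never pays at the margin
    lo, hi = 0.0, 1.0
    while foc_gap(hi, d, T, alpha_m, c) > 0.0:
        hi *= 2.0  # expand the bracket until the gap turns negative
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if foc_gap(mid, d, T, alpha_m, c) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def prefers_election(d, T, alpha_m, c, delta):
    """Condition (9): z(x*, d) * (delta*T - c*x*) >= T / alpha_m."""
    x_star = optimal_bias(d, T, alpha_m, c)
    return z(x_star, d) * (delta * T - c * x_star) >= T / alpha_m
```

Under these assumptions, with `T = 1`, `alpha_m = 1`, `c = 0.1` and `delta = 0.9`, the E-strategy is preferred at `d = 1` but not at `d = 0.01`, where no biasing is worthwhile.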

Proposition 1

Under Assumptions 1 and 2, a unique pooling or separating equilibrium exists, according to whether \(z\left( x^*,d\right) \left( \delta T-cx^*\right) \gtrless T/\alpha _m\). The optimal strategies for benevolent and rent-seeking incumbent politicians are:

Pooling equilibrium. At time 1, rent-seeking politicians provide \(G^{r,E}_1=T/ \left[ \theta _1\alpha _m z\left( x^*,d\right) \right] \) by carrying out information-biasing actions \(x^{r,E}_1=x^*\), and are re-elected with probability \(q=1\). At time 2, they choose \(G^{r,E}_2=0\) and \(x^{r,E}_2=0\). At \(t=1\) and \(t=2\) benevolent politicians provide the maximum feasible amount of public good \(G^{b}_t=T/\theta _t\), take no information biasing actions \(x^{b}_t=0\) and, if in office in the first term, are re-elected by the majority of citizens for a second term.

Separating equilibrium. At time 1, rent-seeking politicians provide \(G^{r,H}_1=0\), do not take actions to bias information available to voters, \(x^{r,H}_1=0\), and are voted out by citizens, \(q=0\). Benevolent politicians behave the same as under the pooling equilibrium.

As shown by Eq. (8), optimal information biasing depends on distance, voters’ awareness and the unit cost of the biasing activity. In particular:

Proposition 2

By applying the implicit function theorem to Eq. (8), \(\partial x^*/ \partial d \gtreqless 0\), \(\partial x^*/ \partial \alpha _m < 0\) and \(\partial x^*/ \partial c < 0\)

Proof

See Appendix C.1. \(\square \)

The effect of voter-politician distance on the optimal biasing action is uncertain, since it depends on how x and d interact in affecting the signal received by voters: if \( z_{xd}<2z_x z_d/z(x,d)\), then \(\partial x^*/\partial d<0\); if the inequality is reversed, \(\partial x^*/\partial d>0\). The simple intuition is that if the marginal effectiveness of the biasing action decreases or weakly increases with d (for example, because local media are less effective in checking and balancing political power or they are more easily manipulated by local politicians), the intensity of the biasing action and the resources spent on information biasing by rent-seeking politicians are reduced in more centralized political systems. On the other hand, when the marginal effectiveness of biasing actions strongly increases with the distance, higher centralization leads incumbent rent-seekers to increase their efforts to bias information available to voters. Concerning \(\alpha _m\) and c, when voters are more politically unaware, rent-seeking politicians have less need to bias information in order to extract private rents from taxes and be re-elected. Similarly, if biasing actions are more expensive, the optimal value \(x^*\) unambiguously drops.
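The sign conditions in Proposition 2 can be traced with a short calculation. The following is only a sketch consistent with Eq. (8), not a reproduction of Appendix C.1; the auxiliary function F is notation introduced here for illustration. Writing the first-order condition as \(F\left( x^*,d,\alpha _m,c\right) =0\):

$$\begin{aligned} F\left( x,d,\alpha _m,c\right)&\equiv \frac{Tz_x}{\alpha _mz^2}-c, \\ F_x&=\frac{T}{\alpha _m}\,\frac{z_{xx}z-2z_x^2}{z^3}<0, \qquad F_d=\frac{T}{\alpha _m}\,\frac{z_{xd}z-2z_xz_d}{z^3}, \\ F_{\alpha _m}&=-\frac{Tz_x}{\alpha _m^2z^2}<0, \qquad F_c=-1<0. \end{aligned}$$

By the implicit function theorem, \(\partial x^*/\partial k=-F_k/F_x\) for \(k\in \{d,\alpha _m,c\}\). Since \(F_x<0\), each derivative has the sign of \(F_k\): it is negative for \(\alpha _m\) and c, while for d it has the sign of \(z_{xd}-2z_xz_d/z\).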

In a similar way the impact of distance, voters’ political unawareness and biasing costs on \({\hat{G}}_m\) may be worked out by differentiating Eq. (5).

Proposition 3

From Eq. (5), \(\partial {{\hat{G}}}_m/ \partial d \gtreqless 0\), \(\partial {{\hat{G}}}_m/ \partial \alpha _m < 0\) and \(\partial {{\hat{G}}}_m/ \partial c > 0\)

Proof

See Appendix C.2. \(\square \)

An increase in voters’ political unawareness and a decrease in information biasing costs unambiguously decrease the provision of public goods by rent-seeking incumbents aiming at being re-elected. By contrast, conditional on choosing the E-strategy, the amount of public good supplied by a rent-seeking incumbent is decreasing with the voter-politician distance only if \(z_{xd} > z_{xx}z_d/z_x\), that is, if the marginal effectiveness of the biasing action does not strongly decrease with d. Otherwise, if the quality and independence of media are poorer at the local than at the central level, such that \(z_{xd} \le z_{xx}z_d/z_x\), in decentralized political systems bad politicians have the opportunity to bias information to voters, neutralize the greater accountability produced by voters’ proximity, and supply a smaller amount of public good.
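The condition on \(z_{xd}\) can be recovered by differentiating Eq. (5) at the optimal bias. The steps below are a sketch under the assumption that \(x^*\) is interior and satisfies Eq. (8); they are not taken from Appendix C.2. Applying the implicit function theorem to Eq. (8) gives \(\partial x^*/\partial d=-\left( z_{xd}z-2z_xz_d\right) /\left( z_{xx}z-2z_x^2\right) \), so that:

$$\begin{aligned} \frac{dz\left( x^*\left( d\right) ,d\right) }{dd}&=z_d+z_x\frac{\partial x^*}{\partial d}=\frac{z\left( z_{xx}z_d-z_xz_{xd}\right) }{z_{xx}z-2z_x^2}, \\ \frac{\partial {\hat{G}}_m}{\partial d}&=-\frac{T}{\theta _1\alpha _mz^2}\,\frac{dz\left( x^*\left( d\right) ,d\right) }{dd}. \end{aligned}$$

The denominator \(z_{xx}z-2z_x^2\) is negative, so the total effect of distance on the bias is positive, and \({\hat{G}}_m\) falls with d, exactly when \(z_xz_{xd}>z_{xx}z_d\), that is, when \(z_{xd}>z_{xx}z_d/z_x\).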

Now, we can derive how the optimal strategy of bad incumbents is affected by distance:

Proposition 4

For any given value of \(\alpha _m\) and c, a unique distance \({\hat{d}}\left( \alpha _m, c\right) \) exists, such that for \(d<{\hat{d}}\) rent-seeking politicians prefer the H-strategy, and a separating equilibrium prevails, while for \(d\ge {\hat{d}}\) the E-strategy is preferred and a pooling equilibrium prevails. The threshold \({\hat{d}}\) decreases with voters’ political unawareness \(\alpha _m\), and increases with the cost of biasing information c.

Proof

See Appendix C.3. \(\square \)

The intuition behind Proposition 4 is the following. Rent-seeking incumbents can disguise themselves as benevolent politicians more easily the larger the distance from voters and the political unawareness of the latter, and the less costly the action to bias information about the real costs of public goods provision. The more difficult cheating is, the higher the threshold \({\hat{d}}\). Below the threshold \({\hat{d}}\), re-election would require providing an amount of public goods so large that bad politicians find it more profitable not to mimic benevolent politicians and to grab all the taxes in the first electoral term.
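The threshold in Proposition 4 can be located numerically for a given bias function. The sketch below is only an illustration under stated assumptions: it uses the hypothetical form \(z(x,d)=1+d\ln (1+x)\), which is not taken from the paper, defines the payoff gap as \(U^{r,E}-U^{r,H}\) from Eqs. (6) and (7), and finds by bisection the distance at which the gap changes sign. All function names and parameter values are assumptions of this example.

```python
import math

def z(x, d):
    """Assumed bias function z = 1 + d*ln(1 + x); illustrative only."""
    return 1.0 + d * math.log1p(x)

def optimal_bias(d, T, alpha_m, c):
    """Solve Eq. (8) for x* by bisection under the assumed z."""
    def gap(x):
        return T * d / ((1.0 + x) * alpha_m * z(x, d) ** 2) - c
    if gap(0.0) <= 0.0:
        return 0.0  # corner case assumed here: no biasing is worthwhile
    lo, hi = 0.0, 1.0
    while gap(hi) > 0.0:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def payoff_gap(d, T, alpha_m, c, delta):
    """U^{r,E} - U^{r,H} from Eqs. (6)-(7), evaluated at x*(d)."""
    x_star = optimal_bias(d, T, alpha_m, c)
    return delta * T - T / (alpha_m * z(x_star, d)) - c * x_star

def distance_threshold(T, alpha_m, c, delta, D, iters=100):
    """Smallest d in [0, D] at which the E-strategy is weakly preferred.

    Returns None if the H-strategy is preferred over the whole range.
    """
    if payoff_gap(D, T, alpha_m, c, delta) < 0.0:
        return None
    if payoff_gap(0.0, T, alpha_m, c, delta) >= 0.0:
        return 0.0
    lo, hi = 0.0, D
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if payoff_gap(mid, T, alpha_m, c, delta) >= 0.0:
            hi = mid
        else:
            lo = mid
    return hi
```

Under this assumed form, raising `alpha_m` lowers the computed threshold and raising `c` increases it, which matches the comparative statics stated in Proposition 4.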

Proposition 4 is illustrated in Fig. 2, which displays the relationship between distance and the amount of public good provided by a rent-seeking incumbent.

Fig. 2

Distance and public good provision by rent-seeking politicians

In Panel A we assume that the marginal effectiveness of the information-biasing action increases or slightly decreases with distance (i.e., \(z_{xd}>z_{xx}z_{d}/z_{x}\) for any d). In this case, the amount of public good supplied by rent-seeking incumbents is zero for \(d<{\hat{d}}\), when the H-strategy is chosen, while it is positive and monotonically decreasing for \(d\ge {\hat{d}}\), when the E-strategy is chosen. At \(d={\hat{d}}\), public goods provided by rent-seeking politicians reach the maximum value \({\tilde{G}}^r\). Further increases in distance reduce the provision of public goods because less and less information is available to voters, and it becomes easier for rent seekers to pass themselves off as benevolent politicians.

The dashed line displays comparative statics with respect to \(\alpha _m\) and c. An increase in the political unawareness of the median voter and a decrease in biasing costs bring about a decrease in \({\hat{G}}_m\), so that the sloped portion of the curve shifts downwards. Moreover, \(\Delta U\) increases (see Eqs. (C.8) and (C.9)), and therefore the threshold \({\hat{d}}\) is lower. Note that in the graph, the value \({\tilde{G}}^r \) at the new threshold is placed below the value at the initial threshold, but it may also be above or at the same level as the latter. Intuitively, more unaware voters allow rent-seeking incumbents to be re-elected by providing a lower amount of public good, and this makes the E-strategy more rewarding even at lower distances. By contrast, an increase in the unit cost of biasing reduces the gain from re-election and narrows the set of values of d for which the E-strategy is optimal.

In Panel B, we consider the case in which the marginal effectiveness of information biasing decreases with voter-politician distance (\(z_{xd}\le z_{xx}z_{d}/z_{x}\)). Once again, there is a threshold \({\hat{d}}\) under which rent-seeking incumbents prefer to play the H-strategy, while for \(d\ge {\hat{d}}\), the preferred strategy is “election”. In this case, however, as the voter-politician distance increases, the optimal biasing action \(x^*\) decreases so strongly as to compensate for the negative effects of distance and improve the transparency of information available to voters. In addition, the reduction of expenses for information biasing allows the incumbent to afford to provide the larger amount of public good needed to be re-elected. Therefore, the provision of public goods by rent-seeking politicians increases with distance from voters and reaches its maximum at the highest possible degree of political centralization (i.e., at \( d = D \)). The comparative statics are similar to those shown in Panel A.

4 Welfare

In this section, we determine the voter-politician distance that would be set by a constitutional legislator who maximizes the expected welfare of citizens. Some degree of centralization may be recommended to induce rent-seeking politicians to supply a positive amount of public good, since too much proximity would prompt them to adopt the H-strategy.

Since the threshold \({\hat{d}}\) does not vary with the realization of \(\theta \), citizens’ expected welfare may be written as:

$$\begin{aligned} SW={\left\{ \begin{array}{ll} SW^H=\beta \left( 1+\delta \right) E\left( \frac{T}{\theta }\right) +\left( 1-\beta \right) \beta \delta E\left( \frac{T}{\theta }\right) &{}\quad \text {if} \quad d<{\hat{d}} \\ SW^E=\beta \left( 1+\delta \right) E\left( \frac{T}{\theta }\right) +\left( 1-\beta \right) E\left( \frac{T}{\theta \alpha _mz\left( x^*\left( d\right) ,d\right) }\right) &{} \quad \text {if} \quad d\ge {\hat{d}}\\ \end{array}\right. } \end{aligned}$$
(10)

The first term on the right-hand side of (10) refers to the payoff obtained when a benevolent politician is in office in period 1, \(\left( 1+\delta \right) E\left( T/\theta \right) \), which happens with probability \(\beta \). The second term shows the possible payoffs if a rent-seeking incumbent is in office in period 1, which happens with probability \(\left( 1-\beta \right) \). In this case, if the distance is lower than \({\hat{d}}\), a separating equilibrium prevails, in which the incumbent plays the H-strategy and the expected payoff of citizens is \(\beta \delta E\left( T/\theta \right) \), that is, the probability of having a benevolent government in period 2 times the discounted expected amount of public good supplied by a benevolent politician. If the distance is large enough, \(d\ge {\hat{d}}\), the equilibrium is pooling: rent-seeking incumbents play the E-strategy, and the expected payoff of citizens is \(E\left( \frac{T}{\theta \alpha _mz\left( x^*\left( d\right) ,d\right) }\right) \).

The second term of (10) clearly shows the trade-off between the discipline and selection effects connected to the choice of distance. The top line captures the gain from the selection effect, that is, the amount of public good expected in the second period thanks to the possible election of a benevolent politician. The bottom line captures the welfare gain due to the discipline effect of elections, which pushes the rent-seeking incumbent to provide a positive amount of public good. Since the first addend of SW does not depend on d, the optimal distance is obtained by comparing the highest possible expected value of public goods provided by a rent-seeking incumbent in period 1 with the expected value of public goods obtainable by voting for the challenger, \(\beta \delta E\left( T/\theta \right) \).
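As a purely illustrative sketch, the piecewise welfare function in (10) can be evaluated numerically. The function name and argument names below are hypothetical. The value \(z\left( x^*\left( d\right) ,d\right) \) is passed in as a number, since no functional form for z is assumed here, and the sketch uses the fact that \(\alpha _m\) and \(z\) do not depend on the realization of \(\theta \), so that they can be taken out of the expectation.

```python
def expected_welfare(d, d_hat, beta, delta, alpha_m, z_star, exp_t_over_theta):
    """Citizens' expected welfare SW as in Eq. (10).

    Illustrative assumptions (not from the paper):
    - z_star is the number z(x*(d), d), supplied by the caller;
    - exp_t_over_theta is the number E(T/theta).
    """
    # Payoff when a benevolent politician is in office in period 1.
    benevolent_term = beta * (1 + delta) * exp_t_over_theta
    if d < d_hat:
        # Separating equilibrium (H-strategy): citizens only gain from the
        # chance of selecting a benevolent challenger in period 2.
        rent_seeker_term = beta * delta * exp_t_over_theta
    else:
        # Pooling equilibrium (E-strategy): the rent-seeker supplies the
        # amount of public good needed for re-election.
        rent_seeker_term = exp_t_over_theta / (alpha_m * z_star)
    return benevolent_term + (1 - beta) * rent_seeker_term


# Example with made-up parameter values.
sw_separating = expected_welfare(0.2, 0.5, 0.6, 0.9, 2.0, 1.0, 10.0)
sw_pooling = expected_welfare(0.7, 0.5, 0.6, 0.9, 2.0, 1.0, 10.0)
```

With these made-up values the separating branch yields slightly higher welfare than the pooling branch, which corresponds to the case in which the selection effect dominates.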

Proposition 5

Let \({\tilde{d}}\ge {\hat{d}}\) be the value of d such that \(E\left( \frac{T}{\theta \alpha _mz\left( x^*\left( d\right) ,d\right) }\right) \) is maximum. If \(\beta \delta E\left( \frac{T}{\theta }\right) >E\left( \frac{T}{\theta \alpha _mz\left( x^*\left( {\tilde{d}}\right) , {\tilde{d}}\right) }\right) \), the expected rewards from selection are greater than the expected rewards from discipline and decentralization is optimal, that is \({\hat{d}}>d^* \ge 0\) (see footnote 13). If \(\beta \delta E\left( \frac{T}{\theta }\right) \le E\left( \frac{T}{\theta \alpha _mz\left( x^*\left( {\tilde{d}}\right) ,{\tilde{d}}\right) }\right) \), the expected rewards from selection are not greater than the expected rewards from discipline, then: (i) when \(z_{xd}>z_{xx}z_{d}/z_{x}\), moderate centralization is optimal, \(d^*={\tilde{d}}= {\hat{d}}\); (ii) when \(z_{xd}\le z_{xx}z_{d}/z_{x}\), maximum centralization is optimal, \(d^*={\tilde{d}}= D\).

Proof

From Proposition 4 and Eq. (8), the threshold \({\hat{d}}\) and the optimal biasing \(x^*\) do not depend on the realization of \(\theta \), while from (C.4) in Appendix C the amount of public good provided by rent-seeking politicians under the pooling equilibrium is strictly decreasing or increasing in d according to whether \(z_{xd}\) is strictly greater than \(z_{xx}z_{d}/z_{x}\) or less than or equal to it. Therefore, all the statements in Proposition 5 follow straightforwardly. \(\square \)
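The case distinction in Proposition 5 can be sketched as a small decision rule. This is only an illustration: the function and argument names are hypothetical, \(z\left( x^*\left( {\tilde{d}}\right) ,{\tilde{d}}\right) \) is supplied as a number, and the common factor \(E\left( T/\theta \right) \) is dropped from both sides of the comparison because \(\alpha _m\) and \(z\) do not depend on \(\theta \).

```python
def optimal_regime(beta, delta, alpha_m, z_tilde, zxd_above_cutoff):
    """Classify the welfare-optimal distance following Proposition 5.

    Illustrative assumptions (not from the paper):
    - z_tilde is the number z(x*(d_tilde), d_tilde);
    - zxd_above_cutoff is True when z_xd > z_xx * z_d / z_x.
    """
    # Both rewards are expressed per unit of E(T/theta).
    selection_reward = beta * delta
    discipline_reward = 1.0 / (alpha_m * z_tilde)
    if selection_reward > discipline_reward:
        return "decentralization"          # d* < d_hat
    if zxd_above_cutoff:
        return "moderate centralization"   # d* = d_hat
    return "maximum centralization"        # d* = D


# Example with made-up parameter values.
regime_high_beta = optimal_regime(0.8, 0.9, 2.0, 1.0, True)
regime_low_beta_a = optimal_regime(0.4, 0.9, 2.0, 1.0, True)
regime_low_beta_b = optimal_regime(0.4, 0.9, 2.0, 1.0, False)
```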

Proposition 5 clearly shows that institutional factors, as captured by the values of \(\beta \), \(\alpha _m\), c and \(z_{xd}\), are key determinants of optimal voter-politician distance. In particular, hinging on Eqs. (C.4)–(C.9) and Propositions 4 and 5, we can state:

Proposition 6

Decentralization is more likely to be an optimal setting: (i) the larger the share of benevolent politicians \(\beta \), and hence the higher the expected rewards from selection; (ii) the larger the political unawareness of voters \(\alpha _m\); (iii) the smaller the unit cost of biasing information c.

5 Heterogeneous Regions

As shown in the previous sections, the relation between voter-politician distance, public goods provision and social welfare is not univocal, depending on the quality of the institutional context. So far, we have taken the perspective of a single administrative unit, or implicitly assumed that \(\beta \), \(\alpha _{m}\) and c are the same across different administrative units. In fact, the quality of institutional factors can be heterogeneous across regions in a country. Let us assume that these parameters are region specific, and that \(\beta _\mu \), \(\alpha _{m\mu }\) and \(c_\mu \) are the average values of the share of good politicians, political unawareness of median voters, and cost of information biasing.

Consider politicians’ quality. A high \(\beta _\mu \) makes a separating equilibrium socially more rewarding than a pooling equilibrium and pushes the constitutional legislator to choose a more decentralized setting, inducing bad politicians to reveal themselves as rent-seekers and exploiting the high probability of challengers being benevolent. However, a reduction of voter-politician distance through a decentralization reform may be harmful for regions populated by a low share of good politicians (lower than \(\beta _\mu \)), as it involves forgoing gains from discipline without obtaining much from selection.

A larger average unawareness of voters \(\alpha _{m\mu }\) has a negative impact on social welfare for \(d\ge {\hat{d}}\) and no effect for \(d < {\hat{d}}\). This is because voters’ unawareness permits the bad politicians who play the E-strategy to supply a smaller amount of public goods. This leads the constitutional legislator to prefer more decentralized political systems. However, while a lower distance benefits regions populated by relatively credulous voters, where rent-seeking politicians are very likely to aim for a second term by playing the E-strategy, it can damage areas with realistic voters, where bad politicians have less incentive to pool with good politicians and are more prone to switch from the E- to the H-strategy.

Similarly, a lower value of \(c_\mu \) decreases social welfare when \(d\ge {\hat{d}}\), by leading rent-seeking politicians to spend more resources on information biasing, and makes decentralization more rewarding. However, a lower d can lead rent-seeking politicians in regions with relatively high biasing costs to play the H-strategy and provide zero public goods. In this case, decentralization reforms might favor some administrative units (the ones with low c) and penalize others.

Finally, suppose that the effectiveness of information biasing differs across regions, and in particular that \(z_{xd}\) is larger than \(z_{xx}z_{d}/z_{x}\) in good regions and lower than that in bad ones. It follows that a reduction in distance is likely to be beneficial for the former regions (as long as welfare is decreasing in distance) and harmful for the latter (where the relationship between distance and welfare is reversed).

Figure 3 illustrates the above discussion on the effects of decentralization with heterogeneous regions, focusing on a two-region case in which diversity arises in either the quality of politicians \(\beta \) (panel A) or the effectiveness of information biasing \(z_{xd}\) (panel B). The horizontal line drawn for \(d < {\hat{d}}\) displays the social welfare in the case a separating equilibrium occurs (i.e. when low distance triggers H-strategy by rent-seeking politicians). For \(d \ge {\hat{d}}\), social welfare is decreasing or increasing in the distance according to the value of \(z_{xd}\).

In Panel A, the relationship between distance and social welfare is depicted for regions L and S. Region L has a large share of good politicians, compared to the average share of benevolent politicians in the country, i.e. the value used by a welfare maximizing constitutional legislator to decide on the optimal distance. Region S is largely populated by bad politicians, so that \(\beta _L>\beta _\mu >\beta _S\). In the case considered in the figure, \(SW^H\left( \beta _\mu \right) >SW^E\left( \beta _\mu , {\hat{d}}\right) \), and therefore the distance maximizing the average social welfare is lower than \({\hat{d}}\). However, optimal distance is different in the two regions: lower than \({\hat{d}}\) for the large-\(\beta \) region L, and equal to \({\hat{d}}\) for the small-\(\beta \) region S. Therefore, a decentralization reform maximizing the average social welfare increases the welfare of citizens in region L from \(SW^E\left( \beta _L, {\hat{d}}\right) \) to \(SW^H\left( \beta _L\right) \), and decreases the welfare of citizens in region S from \(SW^E\left( \beta _S, {\hat{d}}\right) \) to \(SW^H\left( \beta _S\right) \).
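The Panel A argument can be illustrated with a small numerical sketch. From (10), the welfare change from moving a region from the pooling equilibrium at \({\hat{d}}\) to the separating equilibrium is \(SW^H-SW^E=\left( 1-\beta \right) E\left( T/\theta \right) \left[ \beta \delta -1/\left( \alpha _mz\right) \right] \), using the fact that \(\alpha _m\) and z do not depend on \(\theta \). All names and parameter values below are hypothetical and chosen only so that \(\beta _L>\beta _\mu >\beta _S\).

```python
def gain_from_decentralizing(beta, delta, alpha_m, z_hat, exp_t_over_theta):
    """SW^H(beta) - SW^E(beta, d_hat), derived from Eq. (10).

    Illustrative assumptions (not from the paper):
    - z_hat is the number z(x*(d_hat), d_hat);
    - exp_t_over_theta is the number E(T/theta).
    """
    return (1 - beta) * exp_t_over_theta * (beta * delta - 1 / (alpha_m * z_hat))


# Made-up shares of benevolent politicians: region L, country average, region S.
shares = {"L": 0.8, "mu": 0.6, "S": 0.4}
gains = {
    name: gain_from_decentralizing(beta, delta=0.9, alpha_m=2.0, z_hat=1.0,
                                   exp_t_over_theta=10.0)
    for name, beta in shares.items()
}
# gains["mu"] > 0: the average-welfare criterion favors d < d_hat.
# gains["L"] > 0 while gains["S"] < 0: region L gains, region S loses.
```

In this example a reform chosen on the basis of the average share raises welfare in region L and lowers it in region S, as described for Panel A.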

Fig. 3 Social welfare and voter-politician distance in heterogeneous regions

In Panel B, the quality of politicians is assumed to be equal across regions (\(\beta _L=\beta _S=\beta _\mu \)), so that \(SW^H\) is equal as well, while the effectiveness of information biasing \(z_{xd}\) is strictly different. For the sake of simplicity, a unique threshold \({\hat{d}}\) is assumed to hold for both regions (see footnote 14). Again, an average-welfare-maximizing policy that considers the dotted curve and chooses \(d^*<{\hat{d}}\) would be profitable for the high-\(z_{xd}\) region (where \(SW^E\) is decreasing in distance) and detrimental for the low-\(z_{xd}\) region (where \(SW^E\) is increasing).

Summarizing, our analysis makes it clear that constitutions choosing a unique level of voter-politician distance based on average values of institutional parameters in the country may fail to be optimal for all the regions. From a policy viewpoint, this result suggests that the optimal design of local/central governance rules should take into account the heterogeneity of regions and be accompanied by measures that help make the regions more similar to each other in terms, for example, of the quality of local media and of the ruling class. In the next section, we extend the analysis by considering the salary paid to politicians as a device to counterbalance the possible negative effects of voter-politician distance on weak regions.

6 Endogenous Pool of Politicians

The pool of politicians is one of the elements determining the welfare implications of proximity. In this section we investigate the incentive of citizens to participate in the political process by running for elections. There are monetary and non-monetary rewards from being elected. In particular, we assume that rent-seeking politicians are interested only in monetary incentives while benevolent ones receive gratification from doing their social duties and contributing to the welfare of their community (see footnote 15). In this context, the salary paid to elected politicians has an impact on the choice of citizens to stand for election or not as well as on the choice of officeholders to provide public goods or grab taxes, which, under certain circumstances, might offset the negative effect of voters’ proximity in weak regions.

6.1 Public Good Provision

Let us assume that elected politicians are remunerated with a fixed salary W, independent of their ability or performance. The salary is paid out of taxes T, such that the amount of resources available to produce the public good is \(\left( T-W\right) \). Besides the salary, benevolent politicians reap a non-monetary payoff \(B>0\) from doing their duty as a civil servant (see footnote 16). For rent-seeking politicians \(B=0\). Also, assume that the cost of public good is independent of the incumbent’s ability and that legal and administrative controls constrain the rent diversion to be no larger than \(\left( T-W-\tau \right) \), where higher values of \(\tau \) indicate more effective control. The utility of the incumbent politician is:

$$\begin{aligned} U^b&= B+W \end{aligned}$$
(11)
$$\begin{aligned} U^{r,H}&= T-\tau \end{aligned}$$
(12)
$$\begin{aligned} U^{r,E}&= T\left( 1+ \delta \right) -\delta \tau -\frac{T-W}{\alpha _mz\left( x,d\right) }-cx \end{aligned}$$
(13)

according to whether she is benevolent or a rent-seeker playing H- or E-strategy, respectively.
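As a minimal sketch, the payoffs in (11)–(13) and the resulting choice of a rent-seeking incumbent can be computed for given numbers. The function names are hypothetical, and both the biasing intensity x and the value of \(z\left( x,d\right) \) are supplied by the caller rather than derived from an optimization, which is a simplifying assumption of this illustration.

```python
def incumbent_utilities(T, W, B, tau, delta, alpha_m, z, c, x):
    """Return (U^b, U^{r,H}, U^{r,E}) from Eqs. (11)-(13).

    Illustrative assumption (not from the paper): z is the number z(x, d)
    and x is a given biasing intensity, not necessarily the optimal x*.
    """
    u_benevolent = B + W                                   # Eq. (11)
    u_rent_h = T - tau                                     # Eq. (12)
    u_rent_e = (T * (1 + delta) - delta * tau
                - (T - W) / (alpha_m * z) - c * x)         # Eq. (13)
    return u_benevolent, u_rent_h, u_rent_e


def rent_seeker_strategy(T, W, B, tau, delta, alpha_m, z, c, x):
    """Pick 'E' when U^{r,E} >= U^{r,H}, otherwise 'H'."""
    _, u_h, u_e = incumbent_utilities(T, W, B, tau, delta, alpha_m, z, c, x)
    return "E" if u_e >= u_h else "H"


# Example with made-up parameter values.
u_b, u_h, u_e = incumbent_utilities(10.0, 2.0, 1.0, 1.0, 0.9, 2.0, 1.0, 1.0, 1.0)
choice_patient = rent_seeker_strategy(10.0, 2.0, 1.0, 1.0, 0.9, 2.0, 1.0, 1.0, 1.0)
choice_impatient = rent_seeker_strategy(10.0, 0.0, 1.0, 1.0, 0.1, 2.0, 1.0, 5.0, 1.0)
```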

Proposition 7

(i) The optimal intensity of biasing action \(x^*\) is decreasing with the salary W paid to elected politicians. (ii) A salary \({\hat{W}}\) exists such that: for \(W \ge {\hat{W}}\), rent-seeking incumbents provide the amount of public good \(G^{r,E}_1= {\hat{G}}_m\left( W\right) =\left( T-W\right) /\left[ \theta _1\alpha _mz\left( x^*,d\right) \right] \) in order to be re-elected for a second term; for \(W < {\hat{W}}\) they prefer to divert the maximum rent \(\left( T-W-\tau \right) \) and leave the office. (iii) \({\hat{W}}\) decreases with d and \(\tau \), and \({\hat{G}}_m\) decreases with W.

Proof

See Appendix C.4. \(\square \)

Like voter-politician distance, paying a salary to incumbent politicians has the effect of increasing the value of holding office for a second term. Therefore, in decentralized political systems, a higher compensation is required to motivate rent-seeking politicians to provide enough public goods to citizens. On the other hand, unlike distance, a higher salary unambiguously reduces both outlays to bias information and the amount of public goods supplied by benevolent and rent-seeking politicians, because funds available for both public goods and rents diminish.

6.2 Social Welfare

Introducing remuneration for politicians affects social welfare both by draining public resources and by modifying the pool of candidates for election. Assume that there are two possible types of individuals: if elected, some behave congruently with voters’ welfare and others dissonantly. Let the mass of congruent and dissonant individuals in the economy be the same, normalized to 1. Each individual can earn a monetary income in the private sector equal to \(wa_i\), where \(a_i \sim U\left[ 0,1\right] \) is individual ability, uniformly distributed on the unit interval. Since the elected candidate has to give up the outside income, congruent individuals run for election if \(a_i \le \left( B+W\right) /w\), while dissonant individuals do so if \(a_i \le U^{r,H}/w\) or \(a_i \le U^{r,E}/w\), according to whether \(W<{\hat{W}}\) or \(W\ge {\hat{W}}\). Therefore, assuming, as in the previous sections, that the incumbent and the winning candidate in the second term are random draws from the pool of people running for election, the probability of having a benevolent politician in office is:

$$\begin{aligned} \wp ={\left\{ \begin{array}{ll} \wp _H=\frac{B+W}{B+W+T-\tau }&{}\quad \text {if} \quad W<{\hat{W}} \\ \wp _E=\frac{B+W}{B+W+U^{r,E}}&{} \quad \text {if} \quad W\ge {\hat{W}}\\ \end{array}\right. } \end{aligned}$$
(14)

Substituting into (10), the social welfare becomes:

$$\begin{aligned} SW={\left\{ \begin{array}{ll} SW^H=\wp _H\left( 1+\delta \right) E\left( \frac{T-W}{\theta }\right) +\left( 1-\wp _H\right) \left[ E\left( \frac{\tau }{\theta }\right) +\wp _H\delta E\left( \frac{T-W}{\theta }\right) +\left( 1-\wp _H\right) \delta E\left( \frac{\tau }{\theta }\right) \right] &{}\quad \text {if} \quad W<{\hat{W}} \\ SW^E=\wp _E\left( 1+\delta \right) E\left( \frac{T-W}{\theta }\right) +\left( 1-\wp _E\right) \left[ E\left( \frac{T-W}{\theta \alpha _mz\left( x^*, d\right) }\right) +\delta E\left( \frac{\tau }{\theta }\right) \right] &{}\quad \text {if} \quad W\ge {\hat{W}}\\ \end{array}\right. } \end{aligned}$$
(15)
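For illustration only, (14) and (15) can be combined in a short numerical sketch. The function names are hypothetical. The threshold \({\hat{W}}\), the pooling payoff \(U^{r,E}\), the value \(z\left( x^*,d\right) \) and \(E\left( 1/\theta \right) \) are all passed in as numbers, which is an assumption of this sketch and not a result of the model.

```python
def prob_benevolent(B, W, u_rent):
    """Probability of a benevolent politician in office, Eq. (14)."""
    return (B + W) / (B + W + u_rent)


def social_welfare(W, W_hat, T, B, tau, delta, alpha_m, z_star,
                   u_rent_e, exp_inv_theta):
    """Expected social welfare as in Eq. (15).

    Illustrative assumptions (not from the paper): W_hat, u_rent_e
    (the value of U^{r,E}), z_star (the value of z(x*, d)) and
    exp_inv_theta (the value of E(1/theta)) are given numbers.
    """
    good = (T - W) * exp_inv_theta     # E((T - W)/theta)
    floor = tau * exp_inv_theta        # E(tau/theta)
    if W < W_hat:
        # Separating equilibrium: the rent-seeker's payoff is T - tau.
        p = prob_benevolent(B, W, T - tau)
        bad = floor + p * delta * good + (1 - p) * delta * floor
    else:
        # Pooling equilibrium: the rent-seeker's payoff is U^{r,E}.
        p = prob_benevolent(B, W, u_rent_e)
        bad = good / (alpha_m * z_star) + delta * floor
    return p * (1 + delta) * good + (1 - p) * bad


# Example with made-up parameter values.
sw_h = social_welfare(2.0, 3.0, 10.0, 1.0, 1.0, 0.5, 2.0, 1.0, 9.0, 1.0)
sw_e = social_welfare(4.0, 3.0, 10.0, 1.0, 1.0, 0.5, 2.0, 1.0, 5.0, 1.0)
```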

From (15), when \(W<{\hat{W}}\), a higher salary improves the pool of candidates, and \(SW^H\) reaches its maximum value at a salary equal to or lower than the threshold, \(W^{*}_H\le {\hat{W}}\). By contrast, when \(W\ge {\hat{W}}\), a further increase in the salary paid to politicians can increase or decrease the probability of having a benevolent politician in office, according to whether \(U^{r,E}-\left( B+W\right) \frac{\partial U^{r,E}\left( x^*\right) }{\partial W} \gtrless 0\).
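The sign condition for the pooling case can be checked by differentiating \(\wp _E\) in (14) with respect to W with the quotient rule, treating \(U^{r,E}\) as a function of W evaluated at \(x^*\):

$$\begin{aligned} \frac{\partial \wp _E}{\partial W}=\frac{\left( B+W+U^{r,E}\right) -\left( B+W\right) \left( 1+\frac{\partial U^{r,E}\left( x^*\right) }{\partial W}\right) }{\left( B+W+U^{r,E}\right) ^2}=\frac{U^{r,E}-\left( B+W\right) \frac{\partial U^{r,E}\left( x^*\right) }{\partial W}}{\left( B+W+U^{r,E}\right) ^2} \end{aligned}$$

Since the denominator is positive, the sign of the effect of W on \(\wp _E\) is the sign of the numerator.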

Therefore, if B and/or \(\frac{\partial U^{r,E}\left( x^*\right) }{\partial W}\) are sufficiently large, an increase in salary draws more dissonant than congruent citizens into the pool of candidates, implying that social welfare under the pooling equilibrium, \(SW^E\), can be decreasing in W throughout. Otherwise, \(SW^E\) increases in W until the reduction in the resources available for producing the public good becomes large enough to reverse the relationship (i.e., \({\hat{W}}\le W^{*}_E<T-\tau \)).

6.3 Multiple Regions

As in Sect. 5, let us consider different administrative units with different values of parameters. The optimal salary that maximizes the stepwise function (15) of the average region for any given distance is \(W^{*}_H\left( d,\tau _\mu , c_\mu ,\alpha _{m\mu }\right) \) or \(W^{*}_E\left( d,\tau _\mu , c_\mu ,\alpha _{m\mu }\right) \), according to whether \(SW^H\left( W^{*}_H\right) \gtrless SW^E\left( W^{*}_E\right) \). If politicians can be paid differently across regional governments, salary can be used to mitigate the possible negative effects produced by a decentralization reform on weak regions.

Consider two regions, L and S: in the former it is very difficult to divert public resources, while in the latter rent-seeking incumbents can easily grab taxes undetected (i.e., \(\tau _L >\tau _S\)). From Eq. (14), region L can count on a better pool of candidates under both the separating and the pooling equilibrium. If the average amount of public goods provided by benevolent politicians is greater than the minimum amount that rent-seeking politicians are forced to provide, that is, if \(E\left[ \left( T-W\right) /\theta \right] > E\left( \tau /\theta \right) \), then social welfare in region L is higher in any equilibrium. Also, since a higher \(\tau \) involves a stronger reduction in payoffs for the H-strategy than for the E-strategy, the threshold salary in region L is lower, i.e. \({\hat{W}}_L<{\hat{W}}_S\). As a result, welfare-maximizing salaries are in general different in the two regions, and it can be optimal to pay politicians more in weak S-regions than in well-functioning L-regions, \(W^{*}_S>W^{*}_L\).
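The comparison between the two regions can be illustrated numerically. Setting \(U^{r,E}\) in (13) equal to \(U^{r,H}\) in (12) and solving for W gives \(W=T-\alpha _mz\left[ \delta T+\left( 1-\delta \right) \tau -cx\right] \) when x and z are held fixed. Holding them fixed is a simplifying assumption of this sketch, since in the model \(x^*\) responds to W. The function names and parameter values are hypothetical.

```python
def threshold_salary(T, tau, delta, alpha_m, z, c, x):
    """Salary at which U^{r,E} in Eq. (13) equals U^{r,H} in Eq. (12).

    Illustrative assumption (not from the paper): x and z are held
    fixed, whereas in the model x* itself depends on W.
    """
    return T - alpha_m * z * (delta * T + (1 - delta) * tau - c * x)


def prob_benevolent_separating(B, W, T, tau):
    """Probability of a benevolent politician when W < W_hat, Eq. (14)."""
    return (B + W) / (B + W + T - tau)


# Made-up parameters; region L has tighter controls (tau_L > tau_S).
common = dict(T=10.0, delta=0.5, alpha_m=1.2, z=1.0, c=1.0, x=1.0)
w_hat_L = threshold_salary(tau=3.0, **common)
w_hat_S = threshold_salary(tau=1.0, **common)
p_L = prob_benevolent_separating(B=1.0, W=2.0, T=10.0, tau=3.0)
p_S = prob_benevolent_separating(B=1.0, W=2.0, T=10.0, tau=1.0)
```

In this example the threshold salary is lower and the separating-equilibrium pool of candidates is better in region L, in line with the discussion above.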

In Fig. 4, social welfare is displayed as a function of politicians’ wage for a given distance. In region S bad politicians may divert taxes more easily, and the pooling equilibrium with a wage \(W^*_{S}>{\hat{W}}_{S}\) is socially preferable to the separating equilibrium. The opposite happens in region L, where diverting resources from the production of public goods is difficult, and \(W^*_{L}={\hat{W}}_{L}\). In the latter case, the high average quality of the pool of candidates participating in the electoral competition makes the selection effect of elections socially more valuable than the discipline effect of high salaries, and the separating equilibrium is welfare-optimal. By allowing for different compensations to elected politicians in the two regions, \(W^*_{S}>W^*_{L}\), the legislator can maximize social welfare by realizing the separating equilibrium in region L and the pooling equilibrium in region S.

Fig. 4 Social welfare and politicians’ wage

7 Conclusions

Institutional decentralization reforms are often considered an effective way to increase political accountability, thus increasing social welfare. This is not always supported by the empirical evidence, however. In this paper, we showed that the relationship between distance and public goods provision is non-monotonic, and that the welfare effect of proximity depends on the share of good and bad politicians, the voters’ political awareness, and the cost and effectiveness of information biasing activities by the incumbent politician. Since these institutional features may vary across regions within the same country, a decentralization reform may be welfare improving for some regions but detrimental for others. This is a risk in countries with large regional disparities.

A policy implication of our analysis is that moderate or strong centralization of the political system should be recommended in some cases to select benevolent politicians or even to induce rent-seeking politicians to supply a larger amount of public good. In countries characterized by strong institutional disparities across regions, constitutional reforms aimed at increasing decentralization may prove beneficial for advanced regions and detrimental for weak ones.