Democracy, Urbanization, and Tax Revenue

During the last two centuries, taxation has not only increased dramatically in level and volume; its structure has also changed: from a heavy reliance on customs revenue in the early nineteenth century to a stronger emphasis on income taxation in the twentieth. A common explanation for this development is the spread of democracy, which supposedly increases redistribution and the size of government. This paper argues that the effect of democratization on taxation depends on the distribution of tax preferences in society. These preferences are not uniform: rural farmers prefer different policies than urban workers. Thus, the impact of democratization varies depending on the urbanization rate. The paper uses a novel dataset providing data on government tax revenue in thirty-one countries in Western Europe, the Americas, Australia, New Zealand, and Japan—from as far back as 1800 up to the present day—in order to evaluate the conditional impact of democratization on tax structure. The results show that democracy decreases property taxes in rural countries but instead increases income taxes and decreases excise and consumption taxes in more urbanized states. These results are robust to different estimation methods, a number of control variables, such as interstate warfare, and to alternative measurements of democracy.

In 1850, the government share of the economy was not even 6 %; one hundred years later, it had almost tripled. Not only did the last 200 years see an enormous increase in tax revenue; the composition changed radically. 1 Tariffs went from being the most important source of tax revenue in 1880 to being almost insignificant a century later, while income tax became a major part of government revenues during the same period.
I argue that these radical shifts in tax structure can be explained by changes in political institutions and economic structures. Specifically, democratization has a substantial impact on taxation, but its effect depends on the preferences of the formerly disenfranchised, which differ between urban and rural sectors. By matching class interest with specific taxes, I posit that the effect of democracy is conditional on underlying political preferences. To my knowledge, no previous research on the topic uses data stretching over 200 years and across 31 nations. The results show that democracy decreases property taxes in rural countries, whereas in highly urbanized states it increases income taxes and decreases excise and consumption taxes.
In the period 1800 onward, we not only observe an increase in the size of government, but also a change in the type of government. The number of democracies among the 31 countries in the sample studied in this paper grew from three in 1850 to 27 in 2000 (using the definition in Boix et al. (2012)). One influential theory points to the issue of inequality and argues that since democratization increases the influence of the poor, more redistribution should follow. Although this paper is less concerned with redistribution as such and focuses on tax revenues, the fact that political reform allowed participation from previously disenfranchised groups is important when explaining the evolution of tax systems. In my argument, democracy allows previously suppressed tax preferences to be heard, and as a consequence to affect policy. These preferences are in part a function of urbanization.
Another striking development during the last two centuries is the fundamental economic changes brought about by industrialization. In many places, this led not only to increased economic growth but also to changes in the geographical distribution of the population. The proportion of the population living in cities of 20,000 or more inhabitants rose from around 10 % in the middle of the nineteenth century to over 30 % in 1935 (Banks and Wilson 2012).
The timing of democratization in relation to how urbanized a country is matters since tax preferences diverge between rural and urban voters. The previously disenfranchised groups, for whom democracy grants influence over (among other things) tax policy, are different in a mainly rural agrarian society than in an industrial urbanized country. Rural farmers, urban manufacturing employees, and the old elite had different interests, not only in how public revenue was spent, but also in how it was generated. While an urban worker prefers to shift taxation from consumption onto property and income, a rural farmer is mainly concerned with lowering taxes on land. In this way, the impact of democracy on tax structure depends on urbanization.
Although most earlier research has focused on the adoption of income tax or on taxation as a share of the economy, there are exceptions. Timmons (2010a) considers the determinants of tax structure in 100 countries between 1970 and 1999 and finds that democratization tends to increase taxes on consumption but not on income. With a more historical focus, Aidt and Jensen (2009a) find that franchise extension leads to a higher share of direct taxation, but only when collection costs are low. However, both studies fail to appreciate the impact of preferences over taxation and how these are related to economic development. Urbanization is treated as a mere control variable, and neither study considers the possibility that urban and rural citizens have different tax preferences, which in turn means that democracy can have a different impact in different circumstances. I argue that we cannot understand the effect of democracy on taxation without considering the preferences of previously excluded groups.
The next section develops the theoretical argument. After that follows a summary of two main types of explanation of historical developments in tax structure; the first emphasizing the impact of political participation and the second interstate warfare. The subsequent section presents the data and empirical strategy. Finally, results and robustness analyses are presented followed by a short discussion and conclusion.

Preferences and Representation
My argument is based on the "classical" political science treatment of tax systems, where demand for redistribution is seen as the main determinant (Alt 1983; Kau and Rubin 2002). In theory, enfranchisement of the poor would lead to more redistribution as long as the income of the median voter is lower than the mean (Romer 1975; Meltzer and Richard 1981). Thus, democracy in itself should lead to a progressive tax and transfer system, resulting in reduced inequality (Lee 2005). While it is vital to take both revenues and expenditure into account when considering redistribution, it is important to note that the levels of social spending were very low before the twentieth century (Lindert 2004a: 20). Thus, for many countries, the revenue side had real distributive effects, and in these situations, we can interpret the democracy and redistribution theory in terms of progressive versus regressive taxes. This means democracy is expected to have a positive impact on progressive taxes (e.g., income taxation) and to be negatively related to regressive taxes (such as excise and consumption taxes). Considering the revenue side independently is important in light of the inconclusive evidence of a link between democracy and redistributive spending (see Lindert (2004b) for evidence in support of such a link and Ansell and Samuels (2010) for opposing evidence).
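The median-voter logic can be illustrated with a deliberately simplified sketch. It assumes a proportional tax rate t whose proceeds are returned as an equal lump-sum transfer, and it ignores the labor-supply distortions that the cited models include; the notation is introduced here for illustration and is not taken from those papers.

```latex
% Simplified sketch (assumption): proportional tax t, equal lump-sum transfer,
% no deadweight loss. y_i: pre-tax income of voter i; \bar{y}: mean income;
% c_i: post-tax income of voter i.
\begin{align}
  c_i &= (1 - t)\, y_i + t\, \bar{y}, \\
  \frac{\partial c_i}{\partial t} &= \bar{y} - y_i .
\end{align}
```

Under these assumptions, a voter gains from a higher tax rate exactly when y_i is below the mean, so a median voter whose income lies below the mean favors more redistribution. In the fuller models, efficiency losses from taxation keep the preferred rate below 100 %.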
Another drawback of the redistributive framework is that it does not take the geographic distribution of voters into account. The newly enfranchised have different preferences over taxation: the urban poor have different preferences than the rural poor. Thus, the effect of suffrage extensions (and democratization more generally) is contingent on who the poor are.
The urban-rural cleavage was important both before and during industrialization (although the importance and expression of these conflicts differed between countries) (Lipset and Rokkan 1967). As a country became more industrialized, class overtook the urban-rural divide as the most salient conflict. The new group of low-income manufacturing workers had different preferences regarding tax policy; for example, a progressive income tax was more attractive to them than excise and consumption taxes. Moreover, the concentration of this group in cities facilitated political mobilization (Rodden 2011).
Distributional conflict between peasants and urbanites was common in the nineteenth century, and these conflicts were linked to political representation (Baldwin 1990: 63). For example, in 1891, Denmark introduced an inclusive pension system financed by taxes on consumption. Farmers supported this scheme since the taxes paying for it fell primarily on urban consumers (ibid.). Moreover, the literature on the politics of agricultural protection has shown that not only is the urban/rural divide important, it is also related to democratic reforms (Olper et al. 2014; Swinnen 2009; Thomson 2016).
Rising urbanization shifts tax preferences for the population as a whole, but its effect on politics is conditional upon effective representation. If a new group (e.g., urban workers) with different taxation preferences is excluded from politics, the impact on policy is weak (unless there is a credible threat of revolution (Acemoglu and Robinson 2001) or unrest (Thomson 2016)). Democratization in a predominantly rural society means a stronger influence of the rural poor, while democratization in an urban country means a stronger influence of the urban poor. Since these two groups have different preferences over taxation, the impact of democratization is contingent on the urbanization rate. On the one hand, if the poor are urban manufacturing employees, we expect democratization to reduce taxes on consumption while increasing progressive taxes such as income tax. On the other hand, if the poor are mainly rural farmers suffering from heavy land taxation, we would expect democratization to first and foremost lower taxes on land and shift taxation onto other groups (such as urbanites). When considering the effect of increased political participation, we must have an idea of who the formerly excluded are and have information about their preferences. Urbanization provides a measure of the balance between urban and rural citizens, allowing the formulation of conditional hypotheses regarding the impact of democracy.
Before elaborating more on the preferences of urban and rural voters, two caveats are needed. I am implicitly assuming that political institutions have an impact on taxation and not the other way around. But this is not necessarily the case since taxation can lead to popular mobilization in support of democracy and thus turn the causal arrow in the opposite direction (see for example Bates and Lien (1985), Moore (2004), Ross (2004), and Herb (2005)). It is also important to note that this paper only concerns taxation and does not intend to answer questions about redistribution since this also involves government expenditures.

Rural Poor
There is a growing literature on the politics of agricultural protection exploring how the preferences of voters in this sector affect policy. While the agricultural sector is not always united in its trade preferences (grain farmers and livestock producers, for example, have different preferences over import restrictions; Swinnen 2009), their preferences regarding land taxation should be the same. That is, regardless of what type of agriculture one engages in, lower taxes on land are always preferred. An example of this is the 1912 manifesto of the Swedish Farmer's League party, which argues for "Preventing any the slightest attempt to transfer taxes on the soil..." (my translation) (Bondeförbundets program (Farmer's League's Program) 1912). 2 Reducing land taxes was also a major concern for peasants in nineteenth-century France (Weber 1976, as cited in Morgan and Prasad 2009). More recent evidence suggests that democratization in countries with a large agricultural sector leads to lower taxes on this sector (Olper et al. 2014) 3 and that democratization in highly urbanized states leads to more agricultural protection (measured as the difference between domestic and international price levels) (Thomson 2016). Moreover, the size of the agricultural sector mattered for protection in late nineteenth-century Europe (Swinnen 2009).
Earlier empirical work suggests that farmers' preferences over consumption taxes are not as straightforward. For example, many farmers in nineteenth-century England were salaried workers on larger farms and spent 90 % of their income on grains and potatoes (Swinnen 2009). Thus, taxes on both land and consumption hurt this group. Since the economic structure of the agricultural sector varies over time and between countries, it is not possible to determine a clear preference.
Because income taxes generally hit the more well-off harder than low-income farmers, previously disenfranchised rural poor should support higher taxes on income. Moreover, an income tax has a broader base than a land tax, so taxpayers employed in the agricultural sector should prefer shifting taxes from land to income. Even if the land is not owned by the poor farmer, shifting taxes onto the urban rich is still better than raising land taxes since the latter will in part affect the wage of the farmer.
In sum, poor rural farmers are mainly concerned with reducing taxes on land. Since income taxes hurt the well-off relatively harder, farmers should prefer shifting taxes from land to income. However, their preferences over consumption taxes are ambiguous.

Urban Poor
An individual spending most of his/her income on basic necessities such as food and clothing will be hurt more by a tax on these goods than by an income tax, even if both are proportional. This is why taxes on consumption are generally regressive in their impact (Joumard et al. 2012; Prasad and Deng 2009). The regressive impact of consumption taxes is not a modern phenomenon. In late nineteenth and early twentieth century Sweden, for example, taxes on basic goods disproportionately hurt the urban poor (Wicksell 1908). This was closely linked to politics as the farmers and the upper classes managed to shift taxes from property onto consumption, which meant that the poor working class carried a heavier burden (Wicksell 1898), a burden that would decrease with extended franchise (ibid.). 4 Indirect taxes also fell heavily on the poor in nineteenth-century France and in the UK (Bonney 1995: 87; Wicksell 1908). The urban poor were not only disproportionately hurt by consumption taxes; they also actively opposed them. An indicator of this opposition is the position taken by left-wing parties. While specific positions on tax policy changed during the century, a major difference between the left and the right remained the relative weight put on income and consumption taxation. For example, both Labour in the UK and the Social Democrats in Sweden fought for abolition of consumption taxes and heavier income taxes in their early electoral manifestos (the 1918 Labour Party General Election Manifesto (2000) and the 1911 Social Democratic party program (Misgeld 2001)).
Political representation of the urban working class also affects taxes on property. Intuitively, the urban working class is better off if taxation is shifted from consumption to income and/or property. This preference for higher taxes on property (and especially land) is clearly present in the impact of late nineteenth and early twentieth century organized labor. In the UK, the influence of the Labour Party in the late nineteenth century led to increased taxes on land, and in Australia, the Labor Party introduced a progressive land tax as early as 1910 (Barnes 2011).
In sum, low-income urban workers prefer to shift taxation from consumption onto property and income.

Summary
The rural poor want to shift taxation from land onto income, but it is not clear whether they prefer increased or decreased consumption taxes. Taxes on consumption hurt urban workers especially hard, and this group wants to shift taxation from consumption to land and income, taxes that hurt other groups more, in particular high-income earners and farmers. Table 1 below summarizes the tax preferences of the urban working class and the rural farmers.
It is important to point out that there is not a straight channel between preferences and outcomes in terms of tax shares. Administrative constraints can mute the effects of preference representation. Generally, a lower collection cost of a certain tax is associated with a higher share of revenue from this source (Kenny and Winer 2006). Urbanization and a larger manufacturing sector lead to economies of scale, making self-employment comparatively less attractive, decreasing evasion and increasing tax revenue (Kau and Rubin 2002;He 2013). Moreover, a concentration of people and industry made collection and enforcement of some taxes (for example income tax) less costly (Riezman and Slemrod 1987). Similarly, Tilly (1992:49) observes that the growth of cities led to an increase in excise taxation. Since urbanization decreases collection costs of income and consumption taxation (Aizenman and Jinjarak 2008), this makes the impact of democracy stronger in more urbanized states.
In sum, as a result of administrative/collection costs, the impact of democracy on consumption and income taxes is expected to be stronger in more urbanized countries.

War, Representation, and Tax Policy
Earlier research on the impact of representation is largely concerned with redistribution, and the empirical results are mixed. While some find that democracies redistribute more than non-democracies (Lee 2005;Mueller and Stratmann 2003), others find no such effect (Ansell and Samuels 2010). Importantly, taxation does not equal redistribution, and earlier studies employing historical data (and focusing on franchise) find no effect of improved representation when considering income (Aidt and Jensen 2009b) and wealth taxation (Scheve and Stasavage 2010). 5 Interestingly, evidence from more recent transitions  shows that democracy is associated with increased regressive taxes on consumption but unrelated to progressive taxes on income and capital (Timmons 2010b). 6 Historical research into the origins of income taxation finds that extended franchise decreases the probability of adopting an income tax (Aidt and Jensen 2009b;Mares and Queralt 2015).
Instead of focusing on franchise extension in isolation, Aidt and Jensen (2009a) posit that the impact of political reform depends on tax collection costs, which decrease with literacy. Using data from 1860 to 1938 covering ten countries in western Europe, they show that franchise extension is positively associated with the share of direct taxes only when the enrollment rate (a proxy for collection costs) is fairly high. But, by lumping together land and income taxes, they ignore the conflict between rural and urban voters.
There is also a rich literature on the effect of different democratic institutions on taxation and redistribution (e.g., Hettich and Winer (1999), Gould and Baker (2002), and Iversen and Soskice (2006)). However, since this paper centers on the democracy/non-democracy distinction, the effects of differences within democracies fall beyond the scope.
Another important literature is concerned with the impact of interstate warfare on taxation. Here, the observed increase in tax revenue is hypothesized to be caused by mounting spending pressures, most significantly those of interstate warfare: war is costly, and financing increasingly large armies resulted in rising demand for state revenue. Existing revenue streams were not able to keep up with the costs of war, and this created a need for new types of taxes (Hintze (1970) and Tilly (1992); see also Campbell (1993)). While the empirical evidence is not conclusive, there is data supporting the argument that spending pressure was an important determinant for the adoption of income tax (Aidt and Jensen 2009b) and that conscription during World War I led to more progressive income taxes (Scheve and Stasavage 2010). It has also been shown that war pressure was, under certain institutional circumstances, positively related to fiscal capacity in the early modern period (Karaman and Pamuk 2013). Since all citizens are equally protected, war spending has no strong distributional effects, and this makes the revenue aspect even more important in terms of distributional conflict. The argument linking interstate warfare to heavier tax burdens has been criticized for ignoring the temporal order of events (Morgan and Prasad 2009) as well as for being constrained to Europe (Centeno 1997). While much attention has been directed towards the causal link between war and progressive taxation, it is also important to consider times of peace and a broader range of taxes. That a country adopted a specific tax or that the rate is high or low does not inform us about the importance of the tax in terms of state finances; the share of total revenues does. Furthermore, by including eleven Latin American countries, this paper has the potential to advance our knowledge beyond the historical experience of Europe.
What is clear, however, is that war (or threat of war) must be included as a control variable.

Summary and Hypotheses
Democracy affects taxation through the representation of preferences. These are in turn partly determined by residence: urban and rural voters hold different views on how state revenue is to be generated.
Democratization in a rural country should lead to lower property taxes and heavier taxation of income. However, if the urban tax base is small, it is not certain that increased rates would yield enough revenue to offset a lower property tax. As mentioned above, administrative costs suggest that effects on income and consumption and excise taxes will be weaker in rural countries.
Since the urban working class prefers taxes on income and property over taxes on goods, democratization in an urban country is expected to lead to a higher share of revenue from income and property taxes and a lower share from excise and consumption taxes. This yields the following two hypotheses:
• H1: Democratization in a country with a high urbanization rate leads to: (a) higher shares of income and property taxation and (b) lower shares of excise and consumption taxes.
• H2: Democratization in a country with a low urbanization rate leads to: (a) a lower share of property taxes and (b) a higher share of income taxes.
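One common way to express conditional hypotheses of this kind is through an interaction between democracy and urbanization. The static equation below is only an illustrative sketch: the symbols S (revenue share of a given tax), D (democracy indicator), and U (urbanization rate) are notation assumed here, and the specification omits the controls and dynamics of the models estimated later in the paper.

```latex
% Illustrative sketch (assumed notation): S_{it} is the revenue share of one tax
% type in country i and year t, D_{it} \in \{0, 1\} the democracy indicator,
% and U_{it} the urbanization rate.
\begin{align}
  S_{it} &= \beta_0 + \beta_1 D_{it} + \beta_2 U_{it}
            + \beta_3 \,(D_{it} \times U_{it}) + \varepsilon_{it}, \\
  \mathrm{E}[S_{it} \mid D_{it} = 1, U_{it}] - \mathrm{E}[S_{it} \mid D_{it} = 0, U_{it}]
         &= \beta_1 + \beta_3 U_{it}.
\end{align}
```

In this notation, H1 (a) implies that the quantity on the second line is positive at high values of U when S is the income tax share, while H2 (a) implies that it is negative at low values of U when S is the property tax share.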

Empirical Strategy
Earlier research has been constrained either in time (by focusing on the post-Second World War era) or space (by focusing mainly on Europe). Relying on post-war data means leaving out important variation in the developed countries of the world and many of the most dramatic changes in tax structure. Not only is prior knowledge limited temporally, we also know much less about the relationship between taxation and representation outside of Europe. The new dataset used in this paper is a first step towards addressing these shortcomings. In order to test the argument outlined above, I will provide descriptive as well as multivariate evidence. The first section describes the dependent and independent variables. Descriptive results and multivariate analyses follow. The final section consists of robustness checks and conclusions.

Dependent Variables
The dependent variables are shares of total central tax revenue. Focusing on shares has several advantages over focusing on rates. 7 First, rates can be used not only to generate revenue, but also to discourage a certain behavior. For example, the state can tax tobacco for public health reasons, and the rates can be set so high as to completely eliminate tobacco use, in which case no tax revenue is collected from this source. A second issue with rates is that their effect depends on enforcement and tax administration. A high nominal rate that is never enforced is not likely to generate much resistance. Shares, on the other hand, are indicators of the resulting yield from different taxes. An obvious problem with shares is that they are affected not only by changes in rates, but also by other factors, such as total tax revenue. Some of these issues are addressed in the "Multivariate Analysis" section.
Data on the dependent variables was collected in collaboration with Thomas Brambor and covers 31 countries from 1800 (or independence) to 2012. 8 As far as we know, there exists no comprehensive historical dataset on public finance. Even for OECD member states, no cross-national database provides information from the nineteenth century up to today. Our dataset does so and is based on secondary sources providing partial temporal or geographic coverage. 9 Country-specific information was used to adjudicate between different sources and to judge the quality of the data. The overall aim was to create internally consistent time series that connect to contemporary datasets, an approach suitable for fixed effects models (like the ones used in this paper). A more thorough description of the coding process is provided in the appendix and the codebook. Total tax revenue is disaggregated into direct (property and income) and indirect (customs, excises, and consumption) taxes. Property taxes include taxes on real estate, wealth, and land. The income tax category includes taxes on income, profits, and capital gains by individuals and corporations as well as taxes on payrolls and workforces. Ideally, these categories should be measured separately, but the available sources rarely allow for a more fine-grained categorization.
7 Of course, including both shares and rates would be ideal, but time constraints only allowed us to collect information on one of them.
8 The countries included are Argentina, Australia, Austria, Belgium, Bolivia, Brazil, Canada, Chile, Colombia, Denmark, Ecuador, Finland, France, Germany, Ireland, Italy, Japan, Mexico, New Zealand, Norway, Paraguay, Peru, Portugal, Spain, Sweden, Switzerland, The Netherlands, the UK, the USA, Uruguay, and Venezuela.
9 For example, Astorga et al. (2010), Flora et al. (1983), Mitchell (2007), and OECD (2012).
Tax on consumption consists mainly of sales and turnover taxes prior to the introduction of the value-added tax in the 1960s. Excises are taxes on specific goods, for example, tobacco or alcoholic beverages. The difference between consumption taxes and excises is that the former are broad-based and the latter only cover specific goods. For the hypotheses outlined above, the important aspect of both consumption and excise taxes is their regressive impact, hence they are collapsed into one category in the empirical section.
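The construction of the dependent variables can be sketched in a few lines of Python. This is a minimal illustration rather than the coding procedure used for the dataset: the category names and the revenue figures are hypothetical, and the function simply divides each category by total central tax revenue after collapsing excise and consumption taxes into one category.

```python
from typing import Dict

# Hypothetical category labels; the dataset's own variable names may differ.
REGRESSIVE = ("excise", "consumption")


def tax_shares(revenue: Dict[str, float]) -> Dict[str, float]:
    """Return each tax category as a share of total central tax revenue.

    Excise and consumption taxes are collapsed into a single category,
    mirroring the treatment described in the text.
    """
    total = sum(revenue.values())
    if total <= 0:
        raise ValueError("total tax revenue must be positive")
    collapsed: Dict[str, float] = {}
    for category, amount in revenue.items():
        key = "excise_consumption" if category in REGRESSIVE else category
        collapsed[key] = collapsed.get(key, 0.0) + amount
    return {key: amount / total for key, amount in collapsed.items()}


# Made-up figures for one country-year (not taken from the dataset).
example = {
    "property": 10.0,
    "income": 30.0,
    "customs": 20.0,
    "excise": 25.0,
    "consumption": 15.0,
}
shares = tax_shares(example)
```

Because every share has total revenue in the denominator, a change in revenue from one tax moves the shares of all the others even when their rates are unchanged, which is the limitation of shares noted above.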
Restricting our dataset to the central level allows for a much larger sample, both in time and across countries, but it is problematic in cases where there is significant subnational authority over taxing and spending as this revenue will not be included. Another important aspect of the data is that the sample varies over time. Countries are only included once they are independent, which means that the sample is smaller in the beginning of the nineteenth century. This also means that European countries are overrepresented in earlier years.
Even though the sample used in this paper is the widest available, it still excludes a significant part of the world. For example, it contains no countries from Eastern Europe or Africa, and only one country in Asia (Japan). Data availability also means there is a bias towards developed countries. However, the universe of possible cases also changes over time; in the beginning of the twentieth century, for example, there were only 55 sovereign states (Karatnycky 2000).
In order to achieve maximum coverage in time and across space, we had to work with broad categories of taxes, ignoring sometimes important differences within categories (e.g., types of property taxes).
For more information about the dataset and coding, the reader is referred to the Appendix and the codebook.

Independent Variables
As explained in the theoretical section, not just representation, but effective representation is key to my argument. Changes in political institutions must reflect the inclusion of the previously disenfranchised poor and also the translation of preferences into policy. Boix et al. (2012) measure democracy based on both participation and contestation, which fits well with the theoretical concept. Participation is conceptualized as suffrage rights for the majority of the male population and contestation means that the executive is directly or indirectly elected in free and fair elections (ibid.). Alternative indicators of democracy are evaluated in the sensitivity analysis.
Urbanization is measured as the proportion of the population living in cities with 20,000 or more inhabitants and is from Banks and Wilson (2012). This particular level is chosen based on the study by Aidt and Jensen (2009b), which covers a similar time span. Alternative operationalizations of urbanization are considered in the sensitivity analysis.
A general process of modernization and economic transformation affects both urbanization and the likelihood of democratization. Economic development is also related to the technical aspect of tax collection costs. Thus, a crucial control variable to include is GDP/capita (the data is from Maddison (2007)). 10 As outlined in the "War, Representation, and Tax Policy" section, taxation is linked to interstate warfare, and I have included an indicator for war. The data is from the Correlates of War dataset (Singer et al. 1972), where war is coded as 1 for every year a country is involved in an interstate war. However, the state does not necessarily have to be directly involved in armed conflict to experience a threat; if close neighbors are at war, there could be reason to upgrade and expand the national defense (which is costly and might affect taxation). Moreover, strong defensive capabilities can deter attack. In the sensitivity analysis, two alternative measures are evaluated: number of military personnel and military expenditure (also from the Correlates of War dataset).
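The coding rule for the war indicator (1 for every year a country is involved in an interstate war) can be sketched as follows. The function name and the example spell are hypothetical and serve only as illustration; they are not read from the Correlates of War files.

```python
from typing import Dict, Iterable, List, Tuple


def war_indicator(years: Iterable[int],
                  spells: List[Tuple[int, int]]) -> Dict[int, int]:
    """Code war as 1 for every year inside an interstate-war spell, else 0.

    Each spell is a (start_year, end_year) pair, both years inclusive.
    """
    return {
        year: int(any(start <= year <= end for start, end in spells))
        for year in years
    }


# Illustrative spell only; actual spells would come from the war data.
indicator = war_indicator(range(1912, 1921), [(1914, 1918)])
```

A dummy of this kind captures direct involvement only, which is why the text turns to military personnel and military expenditure as alternative measures of war pressure.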
Finally, as a control for the general level of fiscal capacity, I include a measure of total central tax revenues as a share of the economy.

Results
Table 2 below shows the expected results of the hypotheses. This reflects both the expectations based on the preferences of urban and rural voters, as well as the impact of administrative capacity that is expected to mute the effect of preferences in some cases (represented by a "+/−" in the case of the impact of democracy on excise and consumption taxes in a low urbanization context).
[Table 3 excerpt: Spain (1893-1930, 1937-1976); Brazil (1862-1929); Chile (1892-1908, 1925-1933, 1973-1989); Austria; Netherlands (1830-1896); Portugal]

Descriptive Results
To provide an overview of the sample in relation to the variables of interest, Table 3 shows four different combinations of urbanization (high or low) and democracy (yes or no) with episodes from countries in the dataset. The threshold between low and high urbanization is set at the point where the proportion of the population living in cities of 20,000 or more is 20 %. Roughly 50 % of all country-years are below this value. Democracy/non-democracy is indicated by the dichotomous measure from Boix et al. (2012). Note that this is only a sample of country periods from the dataset since many countries occupy different cells during different time periods. What we can see immediately is that the combination of low urbanization and non-democracy becomes increasingly rare with time.
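The four-cell classification can be written out as a short Python sketch. The function name is hypothetical, and treating a country-year exactly at the threshold as "high urbanization" is an assumption made here, since the text only states that values below the threshold count as low.

```python
URBAN_THRESHOLD = 0.20  # share of population in cities of 20,000 or more


def classify_country_year(urban_share: float, democracy: int) -> str:
    """Assign a country-year to one of the four urbanization/democracy cells.

    urban_share is a proportion between 0 and 1; democracy is the
    dichotomous indicator (1 = democracy, 0 = non-democracy).
    """
    if not 0.0 <= urban_share <= 1.0:
        raise ValueError("urban_share must be a proportion between 0 and 1")
    if democracy not in (0, 1):
        raise ValueError("democracy must be coded 0 or 1")
    # Assumption: a value exactly at the threshold counts as high urbanization.
    if urban_share >= URBAN_THRESHOLD:
        urban = "high urbanization"
    else:
        urban = "low urbanization"
    regime = "democracy" if democracy == 1 else "non-democracy"
    return f"{urban}, {regime}"
```

Applied year by year, a rule like this lets a single country move between cells over time, which is why the table lists episodes rather than countries.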
The relationship can also be explored graphically. Figure 1 shows the relationship between urbanization and the share of income taxes for democracies and non-democracies. 11 In addition to the country-year scatter plot, the graph includes a lowess curve. The figure reveals that revenue from income tax increases with urbanization in democratic countries, while the relationship is not present in undemocratic states. This pattern is consistent with the notion that democracy in combination with high urbanization leads to a greater reliance on income tax. Importantly, if urbanization had an independent impact on income tax (through, for example, lower collection costs), the pattern should be the same for both democracies and non-democracies. 12 Figures 9 and 10 (in the Appendix) show that the share of property taxes decreases while that of excise and consumption taxes increases with urbanization in non-democracies. If urbanization lowered collection costs for property taxes, we would observe the opposite relationship. In democracies, the share from excise and consumption taxes increases with urbanization up to a point and then decreases.
Since urbanization is almost universally increasing in time, these descriptive graphs do not tell the whole story. The figures also hide important within-country variation.
A third way of illustrating the development over time is examining individual countries. For example, the UK is considered a "high urbanization" country throughout the period but only a democracy for roughly half of the time span. Figure 2 graphs the development of income, property, and excise and consumption taxation in the UK from 1800 to 2012. The shaded area indicates democracy and ticks mark the beginning and end of the two world wars. In the period before democratization, Britain relied more on excise and consumption taxation than income tax, while the latter dominates in democratic years. Note also the clear ratchet effect of the First World War; the subsequent decline in income tax share was not enough to offset the increase (on the ratchet effect of war in the UK, see Peacock and Wiseman (1961)).
A country that experienced both rural and urban democracy is France. Figure 3 shows the development of excise, property, and income taxes in France from the early nineteenth century to 2012. The light shade of gray indicates years with democracy and low urbanization while the darker shade of gray represents democracy and high urbanization. The first period of democracy is followed by a decrease in the share of revenue coming from property taxes and an increase in excise tax revenue, consistent with rural tax preferences. Interestingly, urban democracy is not associated with a marked increase in income tax revenue until the onset of World War I, illustrating the plausible effect of interstate warfare. While informative, these descriptive explorations can only take the analysis so far. In the next section, I explicitly control for war and other possibly confounding variables.

Multivariate Analysis
It takes time before the effects of democracy on taxation are realized. First, the effect of constitutional changes such as an extension of suffrage is not instantaneous. Second, there is a process of learning involved in which voters and parties interact. For example, it takes time for newly enfranchised citizens to organize into effective parties. Third, there might be a lag between the move to democracy and the next election. Finally, changing the tax mix takes time: there is a delay between the implementation of a tax law and a measurable impact on government revenue. For all these reasons, I have chosen a dynamic error correction model (ECM) that in a simple and straightforward way partitions short-run and long-run effects. 13 An easy way to think about an ECM is to consider the long-run relationship between X and Y as an equilibrium. Shocks to the equilibrium relationship can have an immediate effect, but the adjustment to a new long-run relationship is allowed to take time (De Boef and Keele 2008). 14 For instance, democratization can have an immediate effect on taxation, but the total impact of democracy will probably take considerable time. Throughout the analysis I will focus on the long-run effects of democracy.
An additional advantage of the ECM is that possible serial correlation can be mitigated by the inclusion of one or more lags of the dependent variable (Beck and Katz 1995). Importantly, including a lagged dependent variable that does not eliminate serial correlation will lead to bias, while serial correlation in the absence of a lagged dependent variable will not (Wilson and Butler 2007). Lagrange Multiplier tests were used to determine the number of lags needed to eliminate serial correlation (as recommended by Beck and Katz (1996)). 15 The main specification includes country-specific intercepts to control for unobserved heterogeneity across units 16 , and I include fixed effects for years to control for common shocks.
The dependent variables are all measured as shares of total tax revenues. Although these shares do not add up to one (customs revenue is not included, for example), they are still likely to be dependent. That is, the share of income taxation might affect and be affected by the share of excise and consumption taxes as well as property taxes. Dependence across models means that estimating three equations separately is inappropriate. Therefore, the results are obtained using the seemingly unrelated regression approach (SUR), which assumes that the error terms are correlated across equations (Zellner 1962). 17,18 The hypotheses concern the impact of democracy on shares of tax revenue conditional on urbanization. In order to evaluate these conditional hypotheses, I include interaction terms in the regressions and provide graphs illustrating marginal effects.
Formally, each equation in the system is estimated as an error correction model. The dependent variable, DV, is first-differenced on the left-hand side and included in lagged levels on the right-hand side; p refers to the number of lags. The interaction term and its constituent terms are added in lagged levels and changes. X is a vector of control variables, and δ and ζ are fixed effects for countries and years, respectively.
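One way to write an equation that matches this verbal description is sketched below. It is a reconstruction under assumptions, not the specification as printed: only DV, p, X, δ, and ζ are named in the text, while the coefficient symbols (α, ρ, β, γ), the labels Dem and Urb, and the way the controls X enter are introduced here for illustration.

```latex
\begin{aligned}
\Delta DV_{i,t} ={}& \alpha + \sum_{k=1}^{p} \rho_k \, DV_{i,t-k}
  + \beta_1 \, \Delta Dem_{i,t} + \beta_2 \, Dem_{i,t-1}
  + \beta_3 \, \Delta Urb_{i,t} + \beta_4 \, Urb_{i,t-1} \\
 & + \beta_5 \, \Delta (Dem \times Urb)_{i,t} + \beta_6 \, (Dem \times Urb)_{i,t-1}
  + \gamma' X_{i,t} + \delta_i + \zeta_t + \varepsilon_{i,t}
\end{aligned}
```

Here i indexes countries and t years. In a form like this, the coefficients on the differenced terms capture short-run effects, while the coefficients on the lagged levels, scaled by the coefficients on the lagged dependent variable, determine the long-run effects.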
In total, four equations are estimated in a SUR system. Table 4 reports regression results for the share of income, property, and excise and consumption taxes. The residual category (mainly taxes on trade) is suppressed for presentational purposes. This table shows only the long-run effects calculated using the Bewley (1979) transformation (De Boef and Keele 2008). 19 These long-run multipliers are the basis for the marginal effects graphs used to evaluate the hypotheses. I will comment on each dependent variable separately and discuss the control variables in the summary.
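To make the idea of a long-run multiplier concrete, the sketch below computes it for the simplest ECM with a single lagged level of the dependent variable, where the multiplier is the negative ratio of the coefficient on the lagged regressor to the coefficient on the lagged dependent variable. Note the swapped-in technique: the standard error here uses the delta method rather than the Bewley transformation applied in the paper, and all coefficient values are hypothetical rather than taken from Table 4 or Table 7.

```python
import math


def long_run_multiplier(alpha1: float, beta1: float) -> float:
    """Long-run effect of X on Y in dY_t = a0 + alpha1*Y_{t-1} + b0*dX_t + beta1*X_{t-1}."""
    if alpha1 == 0:
        raise ValueError("alpha1 must be non-zero (no error correction otherwise)")
    return -beta1 / alpha1


def lrm_se_delta(alpha1: float, beta1: float,
                 var_a: float, var_b: float, cov_ab: float) -> float:
    """Delta-method standard error of -beta1/alpha1 (not the Bewley approach)."""
    variance = (var_b / alpha1 ** 2
                + beta1 ** 2 * var_a / alpha1 ** 4
                - 2 * beta1 * cov_ab / alpha1 ** 3)
    return math.sqrt(variance)


# Hypothetical estimates: 10 % of a disequilibrium is corrected each year.
alpha1, beta1 = -0.10, 0.26
lrm = long_run_multiplier(alpha1, beta1)
se = lrm_se_delta(alpha1, beta1, var_a=0.0004, var_b=0.0025, cov_ab=0.0)
print(f"long-run multiplier: {lrm:.2f} (delta-method SE {se:.2f})")
```

With these made-up numbers, a small yearly coefficient of 0.26 accumulates into a long-run effect ten times as large, which is why the analysis focuses on long-run rather than immediate effects.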

Income Taxation
According to H1a, the effect of democracy should be positive and stronger for higher levels of urbanization. This conditional impact is assessed in column 1 in Table 4. The coefficient of the interaction effect of democracy and urbanization is positive and statistically significant at the 1 % level. An effective way of exploring the hypothesis is to graph the impact of X (democracy) on Y (share of income tax) for different values of Z (urbanization) (Brambor et al. 2006). Figure 4 shows the conditional long-run effect of democracy for a range of urbanization rates. The rug plot at the bottom of the graph illustrates the empirical range of the urbanization variable and reveals that the number of observations decreases substantially for urbanization rates over 0.8. Figure 4 shows that the impact of democratization is positive and significant for higher levels of urbanization, while the impact is negative (but not distinguishable from zero at the five percent level) for lower levels. This means that the long-run effect of democracy depends on the urbanization rate in a way that is consistent with H1a. If we consider a country with an urbanization rate of 55 %, a move to democracy is associated with a long-term increase in the income tax share of approximately 2.6 percentage points. In contrast, a move to democracy at an urbanization rate of 10 % is associated with an almost two percentage point long-term decrease in the share of income tax revenue (however, this latter effect is not statistically significant).

17 Added in the estimates (but not in the results table) is a category representing customs and other tax revenues not included in the list of dependent variables, in order for the shares to add up to one hundred.
18 This is also the approach taken by scholars analyzing similar data, e.g., Timmons (2010b).
19 Table 7 in the appendix shows the original ECM regressions. The Bewley transformation is a method of calculating the standard errors for the long-run multipliers recommended in De Boef and Keele (2008). Important to note is that this is a statistical transformation, not a model. This means that the R 2 should not be interpreted (ibid.).

Notes to Table 4: Standard errors in parentheses. Constants estimated but not reported. Year and country fixed effects included in all models. * p < 0.10; ** p < 0.05; *** p < 0.01
H2b expects democracy to increase income taxes also in low-urbanization contexts since rural farmers prefer income taxes to taxes on land. However, the effect was expected to be weaker because of administrative costs: if the majority of the population lives in the countryside, administrative costs associated with collecting income tax are high, lowering possible yield from this source. As we can see from Fig. 4, the impact of democracy at low levels of urbanization is not statistically distinguishable from zero, hence H2b is not supported by the data.
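The conditional effect plotted in Fig. 4 follows the standard interaction-model formula discussed by Brambor et al. (2006): the effect of democracy at urbanization Z is the democracy coefficient plus Z times the interaction coefficient, with a standard error that combines both variances and their covariance. The sketch below illustrates the calculation; the coefficients and the covariance terms are hypothetical placeholders and are not the estimates reported in Table 4.

```python
import math


def conditional_effect(z: float, b_dem: float, b_int: float,
                       var_dem: float, var_int: float, cov: float):
    """Effect of democracy at urbanization z, with a 95 % confidence interval."""
    effect = b_dem + b_int * z
    se = math.sqrt(var_dem + z ** 2 * var_int + 2 * z * cov)
    return effect, effect - 1.96 * se, effect + 1.96 * se


# Hypothetical long-run coefficients and (co)variances, for illustration only.
params = dict(b_dem=-3.0, b_int=10.0, var_dem=2.25, var_int=9.0, cov=-3.5)

for z in (0.10, 0.36, 0.55):
    effect, lo, hi = conditional_effect(z, **params)
    verdict = "distinguishable from zero" if lo > 0 or hi < 0 else "not distinguishable from zero"
    print(f"urbanization {z:.2f}: effect {effect:+.2f} [{lo:+.2f}, {hi:+.2f}] -> {verdict}")
```

The point of the exercise is that the coefficient on democracy alone only describes the effect at zero urbanization; whether the effect is distinguishable from zero has to be judged separately at each urbanization rate.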
Taking urbanization into account shows that the positive relationship between democracy and income taxes is only present in more urbanized states. This result can explain why (Timmons 2010b)

Taxation of Consumption
Hypothesis 1b states that democracy in combination with higher levels of urbanization should have a negative effect on excise and consumption taxes. Column 2 in Table 4 shows that the coefficient of the interaction term is negative but just short of reaching statistical significance (p = 0.102). A more effective way of evaluating the hypothesis is to examine Fig. 5, which plots the marginal effect of democracy for different levels of urbanization. The figure shows that the effect is always negative and suggests that this impact is stronger for higher levels of urbanization. If we consider a country at the third quartile level of urbanization (36 %), democracy is associated with a long-term decrease in excise and consumption taxes of 5.3 percentage points. However, this effect is not statistically significantly lower than the effect of democracy at the first quartile level of urbanization, −3.9. In sum, the results lend support to H1b, although with rather imprecise estimates.
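The comparison between the two quartiles can be read off the linear form of the conditional effect. The short derivation below is a sketch under assumptions: the symbols are introduced here, the first quartile is taken to be 10 % (the value given in the discussion of property taxation), and the reported effects are rounded, so the implied slope is approximate.

```latex
\begin{aligned}
ME(Z) &= \beta_{D} + \beta_{D \times U} \, Z \\
ME(0.36) - ME(0.10) &= 0.26 \, \beta_{D \times U} \\
-5.3 - (-3.9) = -1.4 \;&\Rightarrow\; \beta_{D \times U} \approx -1.4 / 0.26 \approx -5.4
\end{aligned}
```

Since the difference between the two effects is a fixed multiple of the interaction coefficient, testing whether the effects differ amounts to testing the interaction term itself, which is consistent with the p-value of 0.102 reported for that coefficient.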
This result is interesting considering earlier work showing a positive impact of democracy on consumption taxes (Timmons 2010b).

Property Taxation
H2a states that if effective representation is extended in a rural country, revenues from property taxes will decrease. In column 3 of Table 4, the data provide clear support for this hypothesis. The coefficient for democracy when urbanization is zero is both negative and statistically significant. The interaction term is positive and significant. Figure 6 shows that the long-run impact of democracy on property taxation is negative and statistically significant for low levels of urbanization, while the effect is positive and statistically significant for high levels.
For a country at the first quartile of urbanization (10 %), democracy is expected to decrease property taxes by 3.4 percentage points, while for a country at the third quartile (36 %), the effect is significantly weaker and indistinguishable from zero. For very high levels of urbanization, for example, 55 %, the impact is positive (two percentage points) and statistically significant. In line with the hypothesis, the data suggest that the lower the urbanization, the stronger the negative effect of democratization.
H1a predicts that democracy is associated with higher property taxes in highly urbanized states since urban workers want to shift taxation from consumption onto income and property. The results reported in column 3 of Table 4 as well as the patterns in Fig. 6 clearly support this.

Summary of Results
The data provide strong evidence in support of H1a and H2a: democracy leads to greater shares of income and property tax in highly urbanized states while it reduces the share of property tax in rural countries.
With regard to H1b, the results do not reject the hypothesis (the impact of democracy for high levels of urbanization is negative), but the impact is also negative for low levels of urbanization, which weakens the results. The expectation was that lower administrative costs associated with taxing consumption in highly urbanized states would make the impact of democracy more pronounced, but this is not realized in the data. Finally, with regard to H2b, the data reveal no statistically distinguishable impact of democracy on income tax share when the urbanization rate is low. This could be a result of rural countries having fewer salaried workers and the income tax yielding more revenue in urban countries. Importantly, this is not the result of economic development since the models include a control for GDP/capita.
Turning to the control variables, fiscal capacity (tax revenue/GDP) is associated with a higher share of income tax and lower shares of property, excise, and consumption tax. The general level of economic development (GDP/capita) is positively related to all three tax categories. Unsurprisingly, war is positively related to the share of income tax. Importantly, democracy still has an effect, contrary to the arguments in Scheve and Stasavage (2010). Interestingly, war is negatively related to the share of property taxes.
Table 5 below shows the sign of the (statistically significant) effect of democracy conditional on urbanization for the different dependent variables, as well as a column indicating whether the results supported the hypothesis or not. As mentioned above, H1a and H2a are clearly supported by the data, and while the evidence does not reject H1b, the support is somewhat weaker since the effect seems to be negative regardless of the level of urbanization. The results for income and property taxation are roughly consistent with those reported in Aidt and Jensen (2009a), although that study groups the two taxes together and focuses on administrative costs.
The next section examines the robustness of these results to different measurements and specifications.

Sensitivity Analysis
The independent variable is democracy, and in the previous section it was measured by a dichotomous indicator from Boix et al. (2012). There are a large number of conceptualizations and measurements of democracy (see Hadenius and Teorell 2005), but few cover a time span of over two hundred years. An exception is the widely used Polity index (Marshall and Jaggers 2002). A re-analysis was made using a dummy variable where a Polity score of seven or more is coded as democracy (the cut-point is recommended by Jaggers and Gurr (1995)), and the results remain the same. 20 One of the elements in the Boix et al. (2012) concept of democracy is male franchise. If there are gender-specific preferences for taxing and spending, the effect of female franchise might be different (e.g., Campbell 1993: 169). To evaluate the sensitivity of only considering male franchise, I constructed a dichotomous variable indicating full (male and female) franchise or not, using version two of the PIPE dataset (Przeworski et al. 2013). The results remain essentially the same, although somewhat weaker for H2a.
Since the expectation or threat of war can have effects similar to those of actual armed conflict, the binary indicator used above might underestimate the effect. If a country experiences a threat of war, a likely response is to upgrade and expand the armed forces. The number of people employed in the armed forces and the level of military expenditures might thus be correlated with a heightened expectation of armed conflict. As noted above, military personnel and spending can also function as deterrence. If countries that have a large standing army and high defense spending are also less likely to get involved in a war, the binary war variable can suffer from selection bias. Data on these two variables are taken from version four of the Correlates of War National Material Capabilities dataset (Singer et al. 1972). Using either of these two alternative measurements does not alter the conclusions.
I used two alternative ways to measure urbanization: the first variable (following Karaman and Pamuk (2013)) defines urbanization as the proportion of the population living in cities of 10,000 or more, and the second uses 50,000 or more as the cutoff point. Using the second measure of urbanization does not change the results, but when using 10,000 as the cutoff point, H2a is rejected.
Another concern is the inclusion of fiscal capacity (tax revenue as a share of GDP) in the models above. Since this might induce post-treatment bias, I have re-run the regressions with this variable excluded. The results remain the same.
A related issue is that the analysis only compares early and late democratizers: early democratizers had a larger rural population than later democratizers. This problem is potentially worsened by the fact that many of today's developing countries are not in the sample. However, recent evidence from the post-Second World War period, including both developed and developing countries, finds that democratization leads to lower taxes on agriculture, employing similar assumptions about the preferences of the rural poor (Olper et al. 2014). As a robustness check, I have rerun the models for two separate time periods. Section D of the appendix shows results for 1800-1913 and 1913-2012 separately. For the period from 1800 to 1913, the results remain the same; they even provide stronger support for the hypotheses than when using the entire sample. However, focusing only on the 1913-2012 era, the results do not hold. This may have several causes. First, by 1913, many countries had already gone through major shifts in taxation, changes that in part were a result of previous changes to the political system. If the relationship between democracy and taxation is a path-dependent process taking a long time, measuring only part of this period is likely to affect the results. Second, since the number of rural countries (both democracies and non-democracies) decreases with time, crucial variation is lost when focusing on 1913-2012 only. Similarly, the number of democracies also changes over time.
Another issue is that the relationship between democracy and taxation might be different in Europe compared with non-Europe. Applying the same model as in section "Multivariate Analysis" on separate geographical samples generates the same results, apart from the results for property taxes in the sample excluding European states. The complete results are available in section D of the Appendix.
Although there are several reasons to choose the somewhat involved model presented in the main analysis, the results hold using much simpler techniques. Models 1-3 in Table 12 (in section E.1 of the Appendix) show results for a simple OLS model with only the variables of interest included (in levels). The results are in line with the conclusions in the main analysis. Models 4-6 add panel-corrected standard errors (as recommended by Beck and Katz (1995)) as well as country and time fixed effects. Again, the results remain essentially the same. Finally, models 7-9 add the same control variables as in Table 4. Overall, the results are unchanged and even stronger.
Another concern with the econometric specification is that it does not address the fact that the dependent variables are bounded between 0 and 100. In section E.2 of the Appendix, I present results using fractional regressions taking this property of the dependent variables into account. These models are identical to the ones in Table 12 (section E.1), but employ logistic models and robust standard errors. Using this approach does not change the results.
In sum, the results are fairly robust to alternative specifications and measurements. One exception is H2a, which is not supported when using a different cutoff for urbanization or when excluding European countries from the sample. This means that the confidence regarding H1a and H1b is strengthened while the results for H2a should be interpreted more carefully. Moreover, the results change when restricting the sample to the time period between 1913 and 2012.

Conclusion
In this paper, I have argued that the impact of democracy on tax structure is conditional on urbanization. Both descriptive and multivariate analyses suggest that democratization in combination with high urbanization is associated with a greater share of tax revenue from income and property taxes and a lower share from excise and consumption taxes. I also find that the effect of democracy on the share of property taxes is negative when urbanization is low. These results depend neither on the indicators used nor the particular econometric specification. Overall, the empirical evidence indicates that the preferences of different classes should be taken into account when considering the impact of democracy on taxation.
It is important to emphasize that the results presented in this paper are macro-level patterns over two centuries. Thus, the interpretation of the results should be mainly descriptive and further research is needed into the proposed causal mechanism. The link between urbanization and public opinion needs to be grounded in better data, even if this means focusing on a few comparative case studies.
A drawback with the theoretical framework employed in this paper is that it does not generate predictions for taxes on trade. A theoretical expansion and refinement should address this important aspect of taxation.
Finally, I have deliberately contrasted democracies with non-democratic states. But since different political institutions such as electoral systems have an impact on geographical representation (Rodden 2010), further research should take differences within democratic systems into account.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

a country over time and connects to contemporary datasets (such as the OECD for European and North American countries and CEPAL for South America) in order to allow easy updates of the dataset. An advantage of this approach is that it is suitable for fixed effects models employed in this paper. We also aimed at minimizing the number of sources for each data series while keeping high coverage over time. Our main interest is to explain long-run trends within countries, so in situations where we needed to prioritize between using one source to obtain cross-country consistency and employing different sources to reach within-country consistency, we preferred the latter. In the codebook, we list all the country-specific primary and secondary sources that we used in the final version of the dataset.
The total tax revenue of the central state is disaggregated based on the guidelines set out in the Government Finance Statistics Manual 2001 of the International Monetary Fund (2001), but given the paucity of historical data and the focus of our project, we combined some categories. Total central tax revenue is divided into direct and indirect taxes. Further, two subcategories of direct taxes are measured separately: taxes on property and taxes on income. The subcategories of indirect taxes are consumption taxes, excises, and customs (which includes taxes on imports, exports, exchange profits, and profits from export/import monopolies).
The first important choice is our focus on central government revenue. Although including all levels of government would be preferred, this was not possible given our ambition regarding temporal and geographical coverage. The focus on the central level means that the overall level of taxation is biased downward in federal countries like the USA or Germany. We chose to express different categories of the budget as shares of total central tax revenue in order to facilitate cross-country comparisons.
The second important choice was how to combine different sources. We have collected information on public finance and economic development from a number of existing data sources. Several of these datasets relied partly on the same underlying data. Nonetheless, many estimates of our variables of interest differed substantially, especially further back in time. Since many of the sources overlapped, one method is to average values. This approach is problematic for a number of reasons. First, because many sources rely on the same underlying data, averaging would hide the potentially duplicated sources. Second, coverage in time varies substantially between countries. Averaging would mean that some sources are overweighted. Finally, since the quality of secondary data differs considerably, averaging might increase rather than decrease measurement error.
Instead of averaging across different sources, we followed a decision tree to decide which sources to use as the basis for our estimates. The following rules were used to guide the coding: (1) Minimize the number of sources. If several sources cover the same information, we prefer to use a single source across categories within the same time period. (2) Prefer high-quality sources. We prioritize primary and country-specific secondary sources. Since these sources often provide more detailed data, this meant that we needed to do some of the categorization ourselves. However, many of the cross-national datasets were of such a high quality that we confidently relied on them for parts of the dataset. (3) Check the consistency of sources. When relying on two or more sources to construct a long-run series, we made certain that the information was comparable when covering the same overlapping time period within a country. In cases of a significant jump at the intersection of two series, this is indicated by coding the last value of the ending series as missing. (4) Time series consistency trumps cross-sectional consistency. As mentioned above, our main interest is long-run trends within countries. When deciding whether to use the same source to obtain cross-country consistency or to use different sources to reach consistent within-country estimates, we prefer the latter.
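Rule (3) can be illustrated with a small splicing routine. This is a sketch of the rule as described, not the project's actual coding procedure: the function name `splice`, the choice to switch sources at the first year of the later series, and the five-percentage-point definition of a "significant jump" are all assumptions made for the example.

```python
from typing import Dict, Optional

Series = Dict[int, Optional[float]]


def splice(ending: Dict[int, float], starting: Dict[int, float],
           max_jump: float = 5.0) -> Series:
    """Join two year -> tax share series (in percent) from different sources.

    The later source takes over from its first year. If the series jump by
    more than max_jump (a hypothetical threshold) at the intersection, the
    last value taken from the ending series is coded as missing (None).
    """
    if not starting:
        return dict(sorted(ending.items()))
    switch_year = min(starting)
    combined: Series = {y: v for y, v in ending.items() if y < switch_year}
    if combined:
        last_year = max(combined)
        if abs(starting[switch_year] - combined[last_year]) > max_jump:
            combined[last_year] = None
    combined.update(starting)
    return dict(sorted(combined.items()))


# Hypothetical income tax shares from two sources that overlap in 1900.
source_a = {1898: 20.0, 1899: 21.0, 1900: 22.0}
source_b = {1900: 30.0, 1901: 31.0}
print(splice(source_a, source_b))
```

Coding the boundary value as missing keeps an artificial break between sources from being read as a real one-year change in the tax share, which matters in a model estimated on first differences.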
The tax categories are expressed as ratios, for example the share of the revenue coming from income taxes. We strongly preferred the numerator and denominator coming from the same source. Table 6 reports the temporal coverage per country and tax category. There are three main reasons information does not extend all the way back to 1800. First, the country might not exist or is not independent. Second, the country exists, but not the tax. And third, the country and the tax exist, but there is no data. Since it is not always known if data is missing because the tax was not in effect that year, if it was not collected, if information is just missing, or if it was a war or some other major crisis, the dates in the table below are from the year we first have data to the last year of data. When the interval contains several larger gaps, I have marked it with an asterisk ( * ). These gaps have different causes. For example, a tax can be removed (e.g., income tax in nineteenth century Britain) or data can just be missing because of war or occupation.

A.2 Trends Over Time
This section provides figures outlining average trends in taxation over time. Figure 7a maps the development of the size of government as well as direct and indirect taxes among the countries in the sample from 1800 to 2011. Figures 7b and 7c show the patterns for the subcategories of direct and indirect taxes, respectively. Finally, Fig. 8 provides the average shares over time for the three categories under investigation in the paper.

B Figures 9 and 10: Property and Excise and Consumption Taxes
Figures 9 and 10 show descriptive scatter plots for property and excise and consumption tax shares and their links to democracy and urbanization.
Table 7 below shows the results from the complete ECM models on which the long-run effects displayed in Table 4 are based.

D Results for Subsamples
In this section, I evaluate four different subsamples: two geographical (with and without Europe) and two temporal (1800-1913 and 1913-2012). Table 8 reports results from the same models as in Table 4, but restricting the sample to European countries. Table 9 reports the same results as Table 4, but excluding European countries from the sample. It is important to point out that data on property taxation is missing completely for a number of non-European countries, which reduces the observations for all equations in the system. Table 10 shows the long-term conditional effect of democracy on income, property, and excise and consumption taxes. The models are identical to the ones estimated in Table 4, but the sample is restricted to the time period from 1800 to 1913. Table 11 presents results from models identical to the ones in the main specification in Table 4, but with a sample restricted to the period 1913-2012.
Notes to Tables 8-11: Year and country fixed effects included in all models. * p < 0.10; ** p < 0.05; *** p < 0.01

E Alternative Econometric Specifications
In this section, I report results from two alternative specifications. First, single-equation static time-series cross-section models employing panel-corrected standard errors and time and country fixed effects. Second, fractional regressions taking into consideration that the dependent variables are bounded.

E.1 Results from Single-Equation Models
Table 12 shows results from very basic single-equation estimations. Models 1-3 are based on the simplest possible OLS regressions, while models 4-6 add panel-corrected standard errors (Beck and Katz 1995) and year and country fixed effects. Finally, models 7-9 include a full set of controls.
Notes to Table 12: Year and country fixed effects included in all models. * p < 0.10; ** p < 0.05; *** p < 0.01

E.2 Results from Fractional Regressions
As a reviewer pointed out, my dependent variable cannot take values smaller than 0 or larger than 100. Fractional regression takes this into account, and the models in Table 13 use logistic fractional regression. Otherwise, the models are identical to those in Table 12, except that the models in Table 13 employ robust standard errors while the models in Table 12 report panel-corrected standard errors. The model is logistic, making the estimates in Table 13 difficult to interpret. Figure 11 graphs the impact of democracy for different values of urbanization. The graphs are based on models 4-6 (that is, the models with full controls). We can immediately see that all graphs tell the same story as the ones in the main results section. Note that the scale of the dependent variables in these models is 0-1 (in the main results section, the scale is 0-100).
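The reason logistic coefficients are hard to read directly, and why a graph such as Fig. 11 helps, can be shown with a small sketch. In a logistic fractional model the linear predictor is passed through the inverse logit, so the effect of democracy on the predicted share depends on the values of the other terms. The coefficients and names below are hypothetical, chosen only for illustration, and are not estimates from Table 13.

```python
import math


def inv_logit(x: float) -> float:
    """Map a linear predictor to a share on the 0-1 scale."""
    return 1.0 / (1.0 + math.exp(-x))


def democracy_effect(urb: float, b0: float, b_dem: float,
                     b_urb: float, b_int: float) -> float:
    """Change in the predicted share when democracy goes from 0 to 1."""
    without_dem = inv_logit(b0 + b_urb * urb)
    with_dem = inv_logit(b0 + b_dem + (b_urb + b_int) * urb)
    return with_dem - without_dem


# Hypothetical logit-scale coefficients for an income tax share equation.
coefs = dict(b0=-2.0, b_dem=-0.3, b_urb=1.0, b_int=1.5)

for urb in (0.10, 0.36, 0.55):
    effect = democracy_effect(urb, **coefs)
    # Multiply by 100 to compare with the 0-100 scale of the main results.
    print(f"urbanization {urb:.2f}: effect {effect:+.3f} ({100 * effect:+.1f} percentage points)")
```

Predicted shares from the inverse logit always stay between 0 and 1, which is the property the linear specifications in the main analysis do not guarantee.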