1 Introduction

The increase in the relative size of the public sectors of developed economies over the last century is well documented in the literature (see Hindriks & Myles, 2006). A wide range of explanations and associated variables have been proposed to explain the historical evolution of the role of government, especially its absolute and relative growth over time. Adolph Wagner’s (1883, 1893, 1911) formulation of the ‘law of increasing state activity’Footnote 1 was followed by Peacock and Wiseman’s (1961) displacement effect and Bird’s (1971, 1972) ‘ratchet effect’, Baumol’s (1967) law, Musgrave’s (1969) stages of development approach, and Peltzman’s (1980) model of the redistributive aspects of government activity. More recently, the literature has been extended by Rodrik's (1998) theory of trade openness, Alesina and Wacziarg’s (1998) theory of country size, models incorporating the role of political institutions such as electoral rules, governmental types, and political participation (Milesi-Ferretti et al., 2002; Persson & Tabellini, 1999), and Acemoglu and Verdier’s (2000) trade-off between market and government failure.Footnote 2

In terms of empirical research, over the last few decades we have witnessed a proliferation of studies aiming to test the existence of a long-run relationship between government expenditure and economic activity by means of advanced statistical and econometric techniques. After Henrekson’s (1993) study, unit root tests and co-integration analyses, with and without structural breaks, have become the standard tools in the relevant literature because of their ability to deal with non-stationary data. Using these sophisticated techniques, the process at the heart of public sector growth can be modelled as a linear long-run relationship, with or without structural breaks (e.g. Durevall & Henrekson, 2011; Peacock & Scott, 2000; Wagner & Weber, 1977). This empirical approach contrasts sharply with the complexity of the relationship to be estimated, at least over long time spans. Indeed, the observed long-run pattern of government growth encompasses periods of rapid economic expansion, increases in the size and scope of the public sector, wars, social upheavals, different phases of economic development, and several waves of technological changes. Hence, unless one takes as the estimation period the years from the 1960s onwards, during which a striking cross-country convergence pattern is generally observed, econometric models which specify fixed parameters and fixed structural equations are unlikely to capture the complex pattern of long-term government growth.

This plethora of theoriesFootnote 3 is symptomatic of the complexity of the subject. Any single theory taken in isolation cannot provide a satisfactory explanation of long-term government growth because public expenditure decisions derive from the complex interplay of multiple factors in which the economic factors of supply and demandFootnote 4 are mediated through political decision-making processes (Gerber & Jackson, 1993). Given this variety of theoretical explanations, instead of adding a new theory to those already existing we choose to take an empirical perspective on the main features and regularities that have characterised the historical evolution of the ‘size of government’ during the process of economic development.

As such, in this study we use a sample of 17 developed countries, observed over the period 1880–2018, to determine the extent to which existing, competing theories of government growth can explain the evidence on the long-run growth pattern of government expenditure. A ‘chaise longue’ pattern of the expansion of government size during the developmental process is clearly evidenced by non-parametric local regression analysis. The results of the Gregory–Hansen co-integration break tests show the existence of a long-run relationship between government expenditure and gross domestic product (GDP), with a GDP coefficient which provides evidence for the attenuation of this relationship after the mid-1970s. Moreover, the rolling regression analysis of the long-run relationship allows the detection of three periods of acceleration in the growth of government spending: the first two correspond to periods of large-scale social upheaval, the third to the ‘golden age of public sector intervention’. Finally, we find that the historical trends of absolute and relative government expenditures display increasing synchronisation in the post-1960s period. The ‘ratchet phenomenon’ in the pre-WWII period and the change in the prevailing ideology from a focus on market failures to government failures in the second half of the twentieth century provide explanations complementary to Wagner’s law and consistent with the observed long-term evolution of the relative (and absolute) growth of government.

The paper is structured as follows. Section 2 presents the data and the literature. Sections 3 and 4 provide, respectively, the results of the non-parametric local regression analysis and the co-integration analysis at the individual country level. Section 5 discusses the time-varying pattern of the long-term relationship, while Sect. 6 examines whether and how the pattern of the long-term evolution of the growth in government spending differs when measured absolutely or relatively. Section 7 interprets the results and concludes the paper.

2 Dataset and literature

Researchers usually explain the long-run evolution of government activity by looking at the ratio of public expenditure to GDP.Footnote 5 Empirical works studying the relationship between government expenditure and economic activity have employed both time-series and cross-sectional frameworks. While the country dimension may vary from several units to more than 100, the temporal dimension is generally quite low, with the time span being mostly limited to a subsample of the post-WWII period (e.g. Afonso & Jalles, 2014; Akitoby et al., 2006; Brueckner et al., 2012; Kolluri & Wahab, 2007; Lamartina & Zaghini, 2011; Magazzino et al., 2015; Ram, 1987; Wahab, 2004). Very few studies use a longer time period; those that do include Henrekson (1993) for Sweden, Durevall and Henrekson (2011) for Sweden and the UK, Paparas et al. (2018) for the UK, and Kuckuck (2012) for five western European countries. Similarly, Oxley (1994) and Thornton (1999) use historical data from the mid-nineteenth century for Britain and six European countries, respectively, but limit their analyses to the pre-WWI period. The only paper to adequately exploit both the country and time dimensions is that by Easterly and Rebelo (1993), with historical data covering 26 countries from the late nineteenth century.

In this study, we use International Monetary Fund (IMF) data on government expenditure and GDP for several developed countries observed over a long time period: 1880–2018. These two dimensions are equally important for evaluating alternative explanations for the growth in the size of the public sector. A time span of a few decades may be insufficient to allow identification of any significant structural change that can be interpreted in terms of economic development, because of the long-term nature of the government expenditure–income relationship. Indeed, studies on economic development/growth generally take the longest possible time span as the study period in order to be able to identify different phases of economic development (Maddison, 1982). The inclusion of countries which are at similar stages of economic development but differ in terms of demographics, trade openness, and political organisation (i.e. their electoral rules, the size and type of their governments, the degree of political participation) in principle allows us to distinguish between alternative theories of government size, such as whether it is demand-driven or supply-driven.

Using a very long time span may be problematic in that historical time series are likely to exhibit short-lived transient components typical of wars or crisis episodes, such as abrupt changes, jumps, and volatility clustering. War years may be handled by including war-interval dummies, by correcting the original data through interpolation where values are missing (Metz, 1992), or by eliminating their impact a priori on the assumption that such shocks are disturbances in the normal structure of the data (Korotayev & Tsirel, 2010).Footnote 6 As shown in Table 3 of Appendix 1, 11 of the 17 countries considered in this study have missing data; in some cases the gaps are limited to a very short period (2 to 3 years), but in other cases they are longer. The missing data are concentrated either at the beginning of the sample or, for several European countries, in war periods. For countries with missing data within the sample, such as Belgium, Germany, Spain, France, the Netherlands, and Norway, co-integration analysis was performed using a wavelet-based approximation of the aggregate series, which yielded unbiased and consistent estimates for the intercept and slope parameters, as in Ramsey et al. (2010), Gencay and Gradojevic (2011), and Gallegati and Ramsey (2012). In particular, wavelet multi-resolution decomposition analysis, by returning at each step a set of averages (along with a set of differences between adjacent averages) based on different window lengths, allowed us to obtain a collection of approximations of the original signal, from finer (S1) to coarser (S4) resolution levels.Footnote 7
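
The averaging-and-differencing step described above can be illustrated with a minimal sketch. It assumes a Haar-type filter and a series whose length is divisible by 2 at every level; the text does not state which wavelet filter was used, and the function names and toy series are hypothetical.

```python
def haar_step(signal):
    """One step: pairwise averages (smooth part) and half-differences (detail part)."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    pairs = list(zip(signal[0::2], signal[1::2]))
    smooth = [(a + b) / 2 for a, b in pairs]
    detail = [(a - b) / 2 for a, b in pairs]
    return smooth, detail


def haar_approximations(signal, levels=4):
    """Return [S1, ..., S_levels]; each entry is a coarser approximation of `signal`."""
    approximations = []
    current = list(signal)
    for _ in range(levels):
        # Assumption: Haar averaging; the details are discarded in this sketch.
        current, _detail = haar_step(current)
        approximations.append(current)
    return approximations


# Toy series (illustrative numbers only, not data from the paper).
series = [float(v) for v in range(1, 17)]  # 16 observations
s1, s2, s3, s4 = haar_approximations(series, levels=4)
```

Each level halves the number of points, so S4 summarises the series over windows of 16 observations. A decomposition that keeps the original sample length at every level would be needed to align S4 with annual data.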

The measurement of public sector expenditure depends on the definition of government. Since processes and levels of fiscal decentralisation differ greatly between countries, the ratio between the data for the alternative definitions of government expenditure, our variable of interest, may display a high degree of variability over time and across space. The IMF data refer to general government, whereas, for example, data on public expenditures in the dataset used by Jordà et al. (2017) refer to central government. Because the countries included in our dataset (see Appendix 1 for a complete list) differ substantially in terms of fiscal decentralisation, we decided to use the IMF data for the main analysis and the macro-historical dataset from Jordà et al. (2017) for robustness purposes.Footnote 8

3 Non-parametric local regression analysis

Due to the ambiguity of Wagner’s (1883) formulation of the ‘law of increasing state activity’, at least six different parametric versions have been proposed in the literature by, for example, Peacock and Wiseman (1961), Gupta (1967), Goffman (1968), Pryor (1969), Musgrave (1969), Goffman and Mahar (1971), and Mann (1980).Footnote 9 In this study, we use probably the most common formulation of Wagner’s hypothesis, as suggested by Musgrave (1969):

$$ \text{Government share} = \alpha + \beta \left( \text{per capita GDP} \right) $$

where government share is measured as the ratio between government expenditure and GDP in nominal terms, and per capita GDP is measured in real terms (Fig. 1).

Fig. 1 Non-parametric relationship between government expenditure (% of GDP) and per capita income using different bandwidths: 0.66 (upper panel) and 0.33 (lower panel)

Given the problems that arise from testing hypotheses about the growth in public expenditures, estimating a rigorously specified model using parametric analysis may be an ambitious task, both because parametric regression analysis makes some strong assumptionsFootnote 10 and because of its rigidity in terms of fixed parameters and functional forms. By contrast, non-parametric analysis does not impose any specific form on the regression function before the data are examined, and so avoids the risk of misspecifying the functional form. Indeed, non-parametric regression analysis can capture the shape of a relationship between variables without prejudging the issue, as it estimates the regression function f(.) linking the dependent to the independent variables directly, without providing any parameter estimates.

Therefore, we estimated the long-term relationship between government share and per capita GDP using non-parametric methods. There are several approaches available to estimate non-parametric regression models, and most of these methods assume that the non-linear functions of the independent variables to be estimated are smooth continuous functions. One of the most commonly used methods of non-parametric regression is called the locally weighted scatterplot smoothing (LOWESS or LOESS) method (Cleveland, 1979), and is an implementation of local polynomial regression. In the LOESS method, the regression function is evaluated at each value of the independent variable xi using the local neighbourhood of each point, and the fitted values are connected in a non-parametric regression curve. When fitting such a local regression, a fixed proportion of the data are included in each given local neighbourhood, which is called the span of the local regression smoother (or the smoothing parameter), and the data points within each window are weighted by a smooth function; the weights decrease as the distance from the centre of the window increases.

The non-parametric fitted curves in Fig. 1 were produced by estimating a local polynomial regression of the government expenditure-GDP ratio on the log of real per capita GDP for all observations, using a second-order polynomial, a Gaussian weighting scheme, and two different bandwidths (66% and 33%, as shown in the top and bottom panels of Fig. 1, respectively).Footnote 11 Both panels of Fig. 1 show a clear positive non-linear relationship between the two variables. At low levels of income, the relationship displays a moderate upward trend; at intermediate levels, that is, at log per capita GDP values between 8 and 10, a considerable increase becomes evident, followed by a change in the slope at the highest levels. When the span is reduced from 0.66 to 0.33, the non-linearity of the relationship is amplified, with the final part of the curve sloping upward.
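
A minimal sketch of such a local quadratic fit is given below, using only the Python standard library. It is an illustration under stated assumptions, not the estimation code behind Fig. 1: the function names are hypothetical, the neighbourhood is taken as the nearest span × n observations, and the Gaussian weights are scaled so that the edge of the neighbourhood lies three standard deviations from the evaluation point.

```python
import math


def solve_linear(a, b):
    """Solve a small system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def local_quadratic_fit(x, y, x0, span=0.66):
    """Fitted value at x0 from a weighted quadratic fit on the nearest span*n points."""
    n = len(x)
    k = min(n, max(3, math.ceil(span * n)))
    nearest = sorted(range(n), key=lambda i: abs(x[i] - x0))[:k]
    h = max(abs(x[i] - x0) for i in nearest) or 1.0  # neighbourhood half-width
    xtx = [[0.0] * 3 for _ in range(3)]
    xty = [0.0] * 3
    for i in nearest:
        d = x[i] - x0
        # Assumption: Gaussian weight with the half-width set to 3 standard deviations.
        w = math.exp(-0.5 * (3.0 * d / h) ** 2)
        basis = (1.0, d, d * d)
        for r in range(3):
            xty[r] += w * basis[r] * y[i]
            for c in range(3):
                xtx[r][c] += w * basis[r] * basis[c]
    # The regressors are centred at x0, so the intercept is the fitted value there.
    return solve_linear(xtx, xty)[0]


# Toy data on an exact quadratic (illustrative only, not data from the paper).
xs = [i / 10 for i in range(50)]
ys = [2 + 3 * v + v * v for v in xs]
curve_wide = [local_quadratic_fit(xs, ys, v, span=0.66) for v in xs]
curve_narrow = [local_quadratic_fit(xs, ys, v, span=0.33) for v in xs]
```

On exactly quadratic toy data both spans reproduce the curve; on noisy data the smaller span follows local movements more closely, which is the sense in which a narrower bandwidth amplifies non-linearity.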

This approach implicitly hypothesises equivalence between all pairs of observations, regardless of the characteristics of any specific country (e.g. De Benedictis et al., 2008, 2009), and, in this sense, may reflect the relationship between government size and GDP for an average country. The pattern of this relationship during the process of development may be used to test some of the theories proposed to explain the growth of public expenditure during the past century, such as the development models of public sector growth and Wagner’s law. Indeed, the stage of development at which a country stands is a crucial consideration for the validity of Wagner’s law. Our findings seem to be consistent with Wagner’s law, since an expanding government accompanies social progress and rising incomes.

In order to assess the relevance of the average-country hypothesis, in Fig. 2 we provide evidence for the applicability of Wagner’s law for each individual country using the long-term components (S4) of the expenditure GDP ratio and per capita GDP extracted by wavelet multilevel decomposition. Two distinct periods may be clearly distinguished, with the break occurring at a log value of per capita GDP of around 8.5: in the first period, the expenditure-GDP ratio grows moderately, with countries displaying a heterogeneous pattern; in the second period, countries show a strong homogeneous pattern, with the ratio first increasing and then decreasing at the highest income values.

Fig. 2 Wagner’s law for each country using the long-term components of the expenditure GDP ratio and per capita GDP (1881–2018)

4 Co-integration analysis

Since it is generally assumed that Wagner was referring to the trend in the ratio of government expenditure to GDP,Footnote 12 this specification is commonly interpreted as a long-run relationship. Thus, in the recent literature, co-integration analysis has become the standard approach to investigating the existence of a long-run relationship between economic development and government activity by testing for both the existence of a long-run relationship and the sign, size, and statistical significance of the coefficient of interest (Durevall & Henrekson, 2011; Magazzino, 2012; Shelton, 2007).

Based on the findings presented in Fig. 2, we decided not to run panel data co-integration analysis, but instead to run co-integration tests for each country.Footnote 13 Given that we only had two variables, and thus there could not be more than one co-integrating vector, we applied the Engle–Granger (1987) procedure to test for co-integration.Footnote 14 Moreover, since non-parametric analysis indicated several shifts in the relationship between government expenditure and real per capita GDP, we also used the Gregory–Hansen (1996) procedure, which tests for co-integration while allowing for a structural break at an unknown date. Both tests use the null hypothesis that there is no co-integration, but the alternative in the Gregory–Hansen test is co-integration with or without a structural break. When both tests rejected the null hypothesis, we concluded that there was co-integration without a structural break; when only the Gregory–Hansen test rejected it, we concluded that there was co-integration with one structural break.
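
The two-step logic of the Engle–Granger procedure and the decision rule just described can be sketched as follows. This is a simplified illustration rather than the authors' code: the residual test below is a Dickey–Fuller regression without lag augmentation or constant, its statistic would have to be compared with Engle–Granger (not standard Dickey–Fuller) critical values, which are not included here, and all function names and the simulated series are hypothetical.

```python
import random


def ols(x, y):
    """Intercept and slope of a simple least-squares regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return my - slope * mx, slope


def engle_granger_stat(x, y):
    """Step 1: co-integrating regression. Step 2: DF t-ratio on its residuals."""
    alpha, beta = ols(x, y)
    resid = [b - alpha - beta * a for a, b in zip(x, y)]
    lagged = resid[:-1]
    diff = [e1 - e0 for e0, e1 in zip(resid[:-1], resid[1:])]
    # Regress the change in the residual on its lagged level (no constant, no lags).
    sll = sum(l * l for l in lagged)
    rho = sum(l * d for l, d in zip(lagged, diff)) / sll
    sse = sum((d - rho * l) ** 2 for l, d in zip(lagged, diff))
    se = (sse / (len(diff) - 1) / sll) ** 0.5
    return beta, rho / se


def classify(eg_rejects, gh_rejects):
    """Combine the two test outcomes using the rule stated in the text."""
    if eg_rejects and gh_rejects:
        return "co-integration without a structural break"
    if gh_rejects:
        return "co-integration with one structural break"
    if eg_rejects:
        # Assumption: the text does not spell out this combination.
        return "case not covered by the rule stated in the text"
    return "no evidence of co-integration"


# Simulated co-integrated pair (illustrative only, not data from the paper).
random.seed(0)
income = [0.0]
for _ in range(199):
    income.append(income[-1] + random.gauss(0, 1))
spending = [1.0 + 0.5 * v + random.gauss(0, 1) for v in income]
beta_hat, df_stat = engle_granger_stat(income, spending)
```

A strongly negative `df_stat` indicates stationary residuals and therefore speaks against the null of no co-integration.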

Table 1 reports, for each country over the period 1880–2018, the estimates of the Engle–Granger co-integrating regression (column β) and co-integration augmented Dickey–Fuller (ADF) test statistics (column ADF) along with the Gregory–Hansen co-integration break test (column GH) for a full regime shift, that is, a change in both the intercept and coefficient of the explanatory variable. The co-integrating regression shows that the estimate of elasticity, β, is always positive, with most of the coefficient values being close to unity; exceptions include Italy and Spain, with values clearly below unity, and Switzerland, whose value is more than double. The Engle–Granger co-integration test statistics reject the null hypothesis of no co-integration only for Canada, Germany, and the USA (at the 5% level). The break dates are of interest where the ADF test is nonsignificant and the GH test is significant, that is, for Australia, Denmark, Finland, Portugal, and Sweden (at the 1% level), and for Belgium, Great Britain, Japan, and Norway (at the 5% level). All in all, even when allowing for a shift of the relationship, the full sample estimate seems to be supportive of Wagner’s law for only a limited number of countries.

Table 1 Engle–Granger co-integrating regression for the 1880–2018 period: Government/GDP = α + β (per capita GDP)

Since the time span covered by our dataset includes several war years and crisis periods, we found it useful to split the sample into two subsamples corresponding to the pre- and post-WWII periods. Moreover, we established a post-1975 subsample, which roughly corresponds to the time span most often used in the recent empirical literature.Footnote 15

Table 2 reports the estimates of the Engle–Granger co-integrating regression and the co-integration ADF test statistics using three different subsamples: 1880–1939, 1951–2018, and 1975–2018. Individual countries’ coefficients differ significantly between pre- and post-WWII periods. Thus, even when similar over the whole sample, they are the ‘result’ of very different stories. Consider, for example, the coefficients for Denmark and Japan. Both are 1.1 in the 1880–2018 period, but while Denmark has a low value (0.42) for the pre-WWII period and a higher value (1.25) for post-WWII, Japan displays the opposite behaviour: high in the pre-WWII period (1.32) and low after WWII (0.35). Interestingly, and consistent with the results of non-parametric analysis, there is clear evidence of a significant reduction in the size and variability of the estimate of the elasticity β from the pre-WWII period to the post-mid-1970s period: the average magnitude decreases from 0.97 to 0.77 and the variability from 1.08 to 0.38. Moreover, the Engle–Granger co-integration test statistic is mostly nonsignificant, the exceptions being Australia, Belgium, France, Great Britain, Japan, and Norway in the pre-WWII period, and only Germany in the post-WWII period.

Table 2 Co-integration test for different sub-periods

Consistent with the results reported in the recent empirical literature on Wagner’s law, after the 1970s the estimated coefficients suggest a completely different picture. The estimates of the elasticity β decrease, becoming negative in many cases,Footnote 16 which implies that the expenditure share decreases as GDP grows, while the positive parameters are generally very low (exceptions include France and Spain, with values close to unity, and, to a lesser extent, Portugal and Sweden, with values greater than 0.4). This is in striking contrast to what might be expected based on Wagner’s hypothesis.

5 The time-varying pattern of the long-term relationship

Based on the evidence provided by both non-parametric and parametric analyses of the long-term pattern of the relationship between government expenditure and per capita GDP, we investigated the time-varying pattern of the relationship using a rolling panel regression framework. This parametric method consists of re-estimating the regression over a fixed-size window that is shifted through the sample (a rolling or centred moving regression), and it allowed us to identify the time-varying pattern of the long-term relationship between government expenditures and output.

Figure 3 shows the evolution over time of the rolling panel fixed-effect regression coefficient between the long-term components of the government GDP ratio and per capita GDP, estimated with different fixed window sizes set to 25, 30, and 35 years. The windows were moved forward by one observation at a time so that we could plot a continuous line of estimated coefficients for the relationship between government/GDP and per capita GDP. Each coefficient line is surrounded by a confidence band of ±1.96 standard errors.
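
The mechanics of the rolling window and its confidence band can be sketched for a single series as follows. This is a simplification under stated assumptions: the paper estimates a panel fixed-effects regression, which would additionally require demeaning each country's data within every window, and the function names and toy series below are hypothetical.

```python
def slope_and_se(x, y):
    """OLS slope of y on x (with intercept) and its conventional standard error."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    intercept = my - slope * mx
    sse = sum((b - intercept - slope * a) ** 2 for a, b in zip(x, y))
    se = (sse / (n - 2) / sxx) ** 0.5
    return slope, se


def rolling_slopes(x, y, window=25):
    """Slide a fixed window forward one observation at a time."""
    results = []
    for start in range(len(x) - window + 1):
        xs, ys = x[start:start + window], y[start:start + window]
        b, se = slope_and_se(xs, ys)
        # Band of plus/minus 1.96 standard errors around the coefficient.
        results.append((start, b, b - 1.96 * se, b + 1.96 * se))
    return results


# Toy series whose slope halves midway (illustrative only, not data from the paper).
years = list(range(60))
ratio = [float(t) if t < 30 else 30.0 + 0.5 * (t - 30) for t in years]
bands = rolling_slopes(years, ratio, window=25)
```

In this toy example the first window recovers a slope of 1 and the last a slope of 0.5, while windows straddling the break return intermediate values, which is how a rolling coefficient traces a gradual change in the underlying relationship.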

Fig. 3 Rolling panel fixed effects regression coefficient of the government expenditure ratio on per capita GDP using fixed window sizes of 25, 30, and 35 years

The main results are threefold. First, long-term elasticity changes continuously and considerably during the period examined. Three periods of increase in the estimated coefficient are clearly detected (one before each world war and one in the 1960s), followed by a considerable decrease in long-term elasticity since the 1980s. Second, the up and down swings tend to occur around an average elasticity not far from unity in the first part of the sample, a value that is almost halved after the 1960s. Third, the confidence bands shrink progressively over time, a result consistent with the estimates reported in Table 2 using different subsamples.

6 Absolute or relative long-term growth rate pattern?

The measure that researchers usually look at when studying Wagner’s law is government spending as a share (percent) of GDP. However, the same long-term evolution of the government expenditure GDP ratio may be produced by two completely opposite patterns: in fact, government size can grow in relative terms either because government growth has accelerated relative to GDP growth or because of a decline in the rate of growth of GDP relative to government expenditure. Therefore, in Figs. 4 and 5 we present, respectively, the first two moments of the growth rate pattern of the government expenditure GDP ratio as well as its components—that is, absolute government expenditure and GDP, both in real terms.Footnote 17
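
The two opposite patterns can be made explicit with a log-difference decomposition. As a sketch in notation that is assumed here rather than taken from the paper, let G_t denote government expenditure and Y_t denote GDP, both deflated by the same price index:

$$ \Delta \ln \left( \frac{G_t}{Y_t} \right) = \Delta \ln G_t - \Delta \ln Y_t $$

With purely illustrative numbers, the ratio grows by roughly 2% a year both when expenditure grows by 5% and GDP by 3%, and when expenditure grows by 3% and GDP by only 1%, so the growth of the ratio alone cannot tell the two cases apart.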

Fig. 4 Long-term growth rates in government expenditure-GDP ratio (top panel), real government expenditure (middle panel), and real GDP (bottom panel) for each country along with average values (solid lines) from 1881 to 2018

Fig. 5 Standard deviation of the long-term growth rates in government expenditure-GDP ratio (dashed line), real government expenditure (dotted line), and real GDP (solid line) from 1881 to 2018

The upper panel of Fig. 4 shows notable differences between countries in the long-term component of the growth rate of the government expenditure-GDP ratio throughout the sample.Footnote 18 These differences provide us with evidence on the variation in the long-term growth rate pattern of the relative government spending of each country.Footnote 19 Two main findings stand out: the wave-like pattern of the long-term growth rate of the government expenditure/GDP ratio (Crowley, 1970), and the break in the cross-country pattern. In particular, the high degree of cross-country heterogeneity typical of the pre-1960s period is followed by a period of increasing synchronisation between countries in terms of the long-term pattern of the growth rate of relative government spending, coinciding with the rapid growth of the public sector from 1960 to 1980, which was mainly driven by the post-war expansion of the welfare state.

The high degree of cross-country heterogeneity in the long-term patterns in the first part of the sample made the detection of a common pattern a very difficult task. Thus, in each panel of Fig. 4 we have drawn a thick line representing the cross-country average. Three long swings can now be clearly detected with peaks occurring in the 1910s, late 1930s, and early 1970s. The first two expansionary waves precede each of the two world wars and correspond to periods of pre-war armament booms. The third, covering a period of 30–35 years following WWII, coincides with the stimulus given to the economy by governments involved in the reconstruction effort after WWII which culminated in the boom of the 1960s. Finally, following the peak of the early 1970s, a reduction in the public expenditure growth relative to GDP culminates in the trough of the early 1990s and is followed by a small increase in the first decade of the twenty-first century.

Furthermore, we split and observed separately the growth rates of two components of the expenditure/income ratio. The middle and bottom panels of Fig. 4 show, for each country, the long-term pattern of the real growth rate of government expenditures and GDP, respectively, with the thick line representing the cross-country average historical trend of the two variables. Two things are worth noting. The first is the strong resemblance between the long swings in the growth rate of public expenditure in absolute and relative terms.Footnote 20 By contrast, the long-term pattern of the GDP growth rate displays quite a uniform pattern over the whole sample, with a unique large peak in the early 1960s.

Therefore, the growth rate of government expenditure in absolute terms may be considered the key variable for capturing long-term variations in the public sector share of GDP. The second thing worth noting, measured by the standard deviation of the long-term growth rate components and displayed in Fig. 5, is the striking reduction in the cross-country heterogeneity of the long-term trend patterns. This reduction, more evident for absolute government expenditure than for GDP, starts in the post-WWII period for government spending and after the 1960s for real GDP. Other differences include the greater variability of the long-term growth rate pattern of government spending with respect to its GDP counterpart,Footnote 21 and the increasing divergence among the GDP growth trends of different countries at the beginning of the twenty-first century. This last finding represents a reversal of the tendency towards higher synchronisation that characterised GDP patterns in the second half of the twentieth century, and the timing seems to be consistent with the heterogeneous effects of globalisation on the economic growth of individual countries.

To summarise, visual inspection of the long-term pattern of the growth rate of government expenditures, both individually and in relation to GDP, suggests several interesting findings: (i) there is evidence consistent with Wagner’s law in absolute terms; (ii) notwithstanding considerable cross-country heterogeneity until the late 1950s, several expansionary waves can easily be identified in the two pre-war armament booms prior to each world war and in the ‘golden age of government expenditure’; (iii) the degree of cross-country heterogeneity in the trending pattern of absolute and relative spending, which is high in the pre-WWII period, decreases considerably in the post-WWII period (as evidenced by the reduction of the confidence band over time in Fig. 3 and the convergent pattern displayed in the final part of each panel of Fig. 4).

7 Discussion and conclusion

The empirical evidence provided in this paper may have interesting implications for the reconciliation of the contrast between the variety and plurality of theoretical approaches and the consensus that seems to have emerged when applying the most recent statistical and econometric methods to the post-WWII period. Our main findings may be summarised as follows. First, there is a long-term positive relationship between government size and per capita income that emerges clearly from the results of the different methods applied in this study, i.e. non-parametric regressions (Fig. 1), co-integration analysis (Table 1), and rolling regressions (Fig. 3). Second, the positive pattern is strongly non-linear, as is evident from the analysis of the relationship between the long-term components (Fig. 2) and the results of co-integration analysis for historical sub-periods (Table 2). Moreover, as a further and specific aspect of this non-linearity, we observe a strong weakening, and even a possible inversion, of the relationship at very high levels of development (first panel of Fig. 1; more clearly in Fig. 2) and in later years, as signalled by several negative coefficients in the co-integration analysis for the 1975–2018 period (Table 2). The inversion of the sign of the relationship is probably responsible for the fall to zero, in the very final years, of the coefficients of the rolling regressions (Fig. 3). Third, three long waves are detected when we consider the whole sample period: two associated with the world wars, and the last in the post-WWII period. These waves are easily detected by rolling regression analysis (Fig. 3) and in the long-term components of growth rates (Fig. 4). Lastly, we find an increasing synchronisation of the ‘cross-country’ pattern, which is confirmed by the fall in the standard deviation of the coefficients of the co-integration analysis (Table 2), the progressive shrinking of the confidence bands of the rolling regressions (Fig. 3), and the reduction, at the end of the sample, of the standard deviation of the long-term components of the growth rates, for both absolute and relative (to GDP) government expenditure (Fig. 5).

Theories aiming to explain the long-term evolution of relative government spending need to account for this multifaceted set of results. While some of the theoretical hypotheses highlighted in the introduction are of little use for interpreting our results (though possibly relevant from other perspectives, e.g. Alesina & Wacziarg, 1998), our view is that, although no single theory can satisfactorily account for the complex historical process of government spending, a combination of them might.

Wagner’s initial hypothesis, together with some qualifications of it (North, 1991, 2003; Rodrik, 1998), can address the long-run positive relationship and, according to some interpretations, its non-linearity given the relative weakening of the relationship at later stages of development. A parallel, and not necessarily conflicting, interpretation is Musgrave's (1969) hypothesis about the link between the non-linear shape of the relationship and different needs at different stages of development. Finally, Baumol’s (1967) disease, based on the lack of growth of productivity in public services, could also provide a partial explanation of the positive long-run relationship.

Nonetheless, none of the previous theoretical approaches can explain the weakening of the relationship, the presence of long waves, and the increasing synchronisation pattern. The ‘ratchet effect’, that is, the view that government expenditure tends to evolve in a step-like pattern represented by an acceleration of the growth rate occurring around periods of social upheaval (Besley & Persson, 2008; Bird, 1971, 1972; Higgs, 1985; Peacock & Wiseman, 1961), is consistent with the pre-WWII evidence, which is characterised by expansionary long waves corresponding to pre-war armament boom and war periods, and heterogeneity in the pattern of government spending. However, the ‘ratchet effect’ cannot explain the third wave of the post-WWII period, nor is it useful for the rest of our evidence.

The third wave, along with the weakening of the relationship, may be clarified to some extent by a mix of two contributions. The first is Peltzman’s (1980) idea that strong redistributive policies were the main source of the growth of government in developed economies after WWII, due to the growth of some social and political groups, such as the ‘middle class’. The second is Acemoglu and Verdier’s (2000, p. 195) hypothesis, according to which ‘in richer economies, the productivity in the private sector is higher, so, the opportunity cost of government intervention is also greater’. Their paper proposes an interesting interpretation of the role of both market failures and state failures in society, along with the possible social reactions to them, which we consider useful for providing a general view of the features of the evolution of government size in recent decades. Since the third long wave coincides with the evidence for the country synchronisation pattern, theories that emphasise individual country-specific features are unlikely to provide a complete interpretation of the post-WWII evidence or valid explanations for the strong convergent pattern in government spending observed in almost all countries from the 1960s.Footnote 22 Hence, theories that refer to the general motivation for public intervention in the economy are natural candidates.

In the post-war era, ideology (i.e. Keynesianism) may have been the root cause of the expansion of government driven by the welfare state. The rapid expansion in the absolute and relative size of the public sector between 1950 and 1970 was the consequence of a positive attitude towards an interventionist role for government with respect to market failures in industrialised economies and the introduction and/or expansion of welfare state policies (Tanzi & Schuknecht, 2000). Following the post-war expansion of the welfare state, the emergence of the public choice school (Buchanan, 1975) reflected an increasing awareness of the limitations of government when correcting the failures of the market. Government failure theorists raised doubts about the efficacy of optimal public policies to correct market failures on the grounds that such policies may create costs and inefficiencies considerably greater than those of market failure. This favoured a rethinking of the appropriate role of government, with regulatory activities replacing direct production of services and price controls within the objectives of public policy. After this shift in ideological belief on the role of government (Tanzi, 2011), the view that emerged was reflected in the slowing down of absolute and relative growth in public spending and GDP which can be observed in all countries. Since the mid-1970s, government expenditure has kept growing at a lower pace compared with the 1970s in almost every sampled country, with the growth rate of public spending now stabilised within the 0–2% range and still slowing.Footnote 23 Although our interpretation is based on conjectures rather than testable hypotheses, we believe that the shift in ideological belief on the role of government, from a focus on market failures to a focus on government failures, is the most plausible explanation for the current general downward tendency of government expenditure growth.Footnote 24

Many authors have suggested that government size is linked to citizens’ willingness to pay taxes, something that is likely to depend on subjective, collective, political, and historical features. For instance, Acemoglu and Verdier (2000) suggest that we should expect a decline in the optimal size of government ‘unless the extent of market failures increases relatively rapidly’.

In recent decades, two types of market failure have emerged which are likely to worsen in the near future: the upsurge of income inequality and, especially, global warming. Together with the present pandemic, they represent challenges that demand a global collective response because of the huge social costs associated with them. It is thus possible that these global events may pave the way for another change in the prevailing ideology: a turn to government, in the sense of a ‘shift or toleration for a wider scope of effective governmental authority over economic decision-making’ (Higgs, 1985).

Twenty-five years after the famous declaration by the American president Bill Clinton in his 1996 State of the Union Address that ‘the era of big government is over’, can it now be said that, as Krugman recently titled an article in The New York Times (11 March 2021), ‘the era of “the era of big government is over” is over’?

8 Appendix 1: Dataset

Data for government expenditure and per capita income were taken from different sources. Per capita income data were taken from the Maddison Project Database 2018 (available at https://www.rug.nl/ggdc/historicaldevelopment/maddison/releases/maddison-project-database-2018). For government expenditure data, several different sources of historical data were available. We relied on the long-term IMF dataset Public Finance in Modern History (based on Mauro et al., 2015), available at this address: http://www.imf.org/external/datamapper/index.php. The dataset refers to ‘general’ government expenditures and potentially covers all world countries for the period 1800–2018, although data from the nineteenth century are available for only a few of them. We used 1880 as the initial year of the study period, as data for periods prior to this year are scarce. The dataset was updated to 2018 using data from IMF Fiscal Monitor. Population data are taken from the Jordà–Schularick–Taylor (JST) Macrohistory Database, available at: http://www.macrohistory.net/data/. Although data for 21 countries from 1880 onwards are available, for robustness purposes we selected a subset of 17 countries—those included in the JST database—because their data on public expenditure refer to ‘central’ rather than ‘general’ government expenditure. The complete list of countries includes Australia (AUS), Belgium (BEL), Canada (CAN), Germany (DEU), Denmark (DNK), Finland (FIN), France (FRA), Great Britain (GBR), Italy (ITA), Japan (JPN), Netherlands (NLD), Norway (NOR), Portugal (PRT), Spain (ESP), Sweden (SWE), Switzerland (CHE), and the United States (US).

The historical datasets suffered from missing data, mainly concentrated in war years. Table 3 provides the list of countries and their associated missing data in our dataset. The treatment of missing data is explained in the last part of Sect. 2 in the main text.

9 Appendix 2: Some basic concepts on wavelet analysis

The wavelet transform provides a time–frequency representation of the signal using a set of orthogonal basis functions, named wavelets. The basis of the wavelet transform, the wavelet, is designed to satisfy two desired properties: the admissibility and regularity conditions. According to the admissibility condition, the wavelet must oscillate so that its mean value is equal to zero:

$$ \int\limits_{ - \infty }^{\infty } {\psi (u){\text{d}}u = 0} $$

According to the regularity condition, the wavelet has exponential decay: it oscillates but is localised, in the sense that it decreases rapidly to zero as |u| tends to infinity, and it is normalised to unit energy:

$$ \int\limits_{ - \infty }^{\infty } {\psi (u)^{2} {\text{d}}u = 1} $$
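As a concrete check of the two conditions, consider the Haar wavelet. It is used here purely for illustration, since the text does not state which wavelet family is adopted; it has compact support, so it trivially decays to zero away from the unit interval:

$$ \psi (u) = \begin{cases} 1 & 0 \le u < 1/2 \\ - 1 & 1/2 \le u < 1 \\ 0 & {\text{otherwise}} \end{cases} $$

$$ \int\limits_{ - \infty }^{\infty } {\psi (u){\text{d}}u} = \frac{1}{2} - \frac{1}{2} = 0,\qquad \int\limits_{ - \infty }^{\infty } {\psi (u)^{2} {\text{d}}u} = \frac{1}{2} + \frac{1}{2} = 1 $$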

All the wavelet functions used in the transformation are generated from a basic wavelet function ψ(·), called the mother wavelet, through translation (shifting) and scaling (dilation or compression). The resulting family of wavelets, defined as:

$$ \psi_{s,u} (t) = \frac{1}{\sqrt s }\psi \left( {\frac{t - u}{s}} \right) $$

is a function of two parameters s and u. The translation or location parameter u indicates where the wavelet is centred along the signal and how far it is shifted through the signal. Thus, it corresponds to the time information in the wavelet transform. The scaling or dilation parameter s controls the length of the wavelet; it is defined as the inverse of frequency and corresponds to frequency information. Scaling either dilates (expands) or compresses the wavelet. Large scales (low frequencies) dilate the wavelet and provide global information about the signal, while small scales (high frequencies) compress the wavelet and provide detailed information hidden in the signal.
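The role of the two parameters and of the 1/√s factor can be checked numerically. The sketch below is illustrative only: it assumes a Haar mother wavelet (not specified in the text), and the helper names `haar`, `daughter`, and `integrate` are hypothetical. It shows that a translated and dilated wavelet keeps zero mean and unit energy.

```python
import math


def haar(x: float) -> float:
    """Haar mother wavelet (an assumption): +1 on [0, 0.5), -1 on [0.5, 1), else 0."""
    if 0.0 <= x < 0.5:
        return 1.0
    if 0.5 <= x < 1.0:
        return -1.0
    return 0.0


def daughter(t: float, s: float, u: float) -> float:
    """Wavelet at scale s and location u: psi_{s,u}(t) = psi((t - u) / s) / sqrt(s)."""
    return haar((t - u) / s) / math.sqrt(s)


def integrate(f, a: float, b: float, n: int = 20_000) -> float:
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


# Scale s = 4 stretches the support to length 4; location u = 10 moves it to [10, 14).
s, u = 4.0, 10.0
mean = integrate(lambda t: daughter(t, s, u), 0.0, 20.0)
energy = integrate(lambda t: daughter(t, s, u) ** 2, 0.0, 20.0)
print(f"mean = {mean:.6f}, energy = {energy:.6f}")
```

Without the 1/√s factor the energy would grow in proportion to s, so wavelets at different scales would not be comparable.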

A parsimonious representation of the evolution over time of the periodic components of a signal is provided by the discrete wavelet transform (DWT), which uses only a limited number of translated and dilated versions of the mother wavelet to decompose the original signal in such a way that the information contained in the signal can be summarised in a minimum number of wavelet coefficients. The DWT proceeds by the pyramid algorithm. Applying the DWT reduces the original N data points into two series of length N/2: one of these contains the smoothed information and the other contains the detailed information. By keeping the details and applying a further transform to the smoothed series, we can produce two series of length N/4, with smoothed and detailed information, and so on. If the length of the original time series is a power of 2, N = 2^J, then the number of coefficients at the end totals N, and these contain all of the information in the original time series, organised according to scale and location, the number of coefficients at each scale being:

$$ N = N/2^{J} + N/2^{J} + N/2^{J - 1} + \cdots + N/4 + N/2 $$
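The pyramid algorithm and the coefficient count can be sketched in a few lines. This is a minimal illustration under stated assumptions: it uses the orthonormal Haar filter (the text does not name a filter), made-up input values, and hypothetical function names `haar_step` and `haar_pyramid`.

```python
import math


def haar_step(x):
    """One pyramid step: split an even-length series into smooth and detail halves."""
    r = 1.0 / math.sqrt(2.0)
    pairs = [(x[2 * i], x[2 * i + 1]) for i in range(len(x) // 2)]
    smooth = [(a + b) * r for a, b in pairs]  # scaled local averages
    detail = [(b - a) * r for a, b in pairs]  # scaled local differences
    return smooth, detail


def haar_pyramid(x, levels):
    """Return ([d_1, ..., d_J], s_J), re-transforming only the smooth series each time."""
    details, smooth = [], list(x)
    for _ in range(levels):
        smooth, d = haar_step(smooth)
        details.append(d)
    return details, smooth


# Made-up series of length N = 16 = 2^4, decomposed to J = 4 levels.
signal = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0,
          7.0, 9.0, 3.0, 1.0, 2.0, 2.0, 6.0, 8.0]
details, smooth = haar_pyramid(signal, levels=4)

# Coefficient counts ordered as s_J, d_J, d_{J-1}, ..., d_1.
counts = [len(smooth)] + [len(d) for d in reversed(details)]
print(counts, sum(counts))  # [1, 1, 2, 4, 8] 16
```

The counts add up to N, so the transform neither loses nor duplicates information; with an orthonormal filter the sum of squared coefficients also equals the sum of squared observations.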

The decomposition of the function f(t) is therefore:

$$ f(t) \approx \sum_{k} s_{J,k} \phi_{J,k}(t) + \sum_{k} d_{J,k} \psi_{J,k}(t) + \dots + \sum_{k} d_{j,k} \psi_{j,k}(t) + \dots + \sum_{k} d_{1,k} \psi_{1,k}(t) $$

with N/2^J coefficients s_{J,k}, N/2^J coefficients d_{J,k}, N/2^{J−1} coefficients d_{J−1,k}, …, and N/2 coefficients d_{1,k}. Further, the approximation can be rewritten in terms of collections of coefficients at given scales as:

$$f\left(t\right)\approx {S}_{J}+{D}_{J}+\dots +{D}_{j}+\dots +{D}_{1}$$

where S_J contains the ‘smooth component’ of the signal, and the D_j, j = 1, 2, …, J, the detailed signal components at ever-increasing levels of detail. S_J provides the large-scale road map, while D_1 shows the pot holes. The previous equation indicates what is termed the ‘multi-resolution decomposition’, where a signal is decomposed into several components, each associated with a different frequency band, with a resolution matched to its scale. Specifically, by using a multi-resolution technique in which different frequencies are analysed with different resolutions, the wavelet transform gives good time resolution and poor frequency resolution at high frequencies, and good frequency resolution and poor time resolution at low frequencies.

By sequentially adding the detail level components D_4, D_3, D_2 to the lower ‘smooth’ component S_4 we obtain three additional levels of approximation: S_3, S_2, and S_1. The higher the index, the smoother the function: S_1 captures fluctuations greater than 4 years, S_2 greater than 8 years, and S_3 greater than 16 years. Table 4 presents the frequency domain interpretation in terms of periods for each detail and the approximation where annual data were used.
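The period bands quoted above follow the usual dyadic rule of thumb, under which detail D_j covers periods of roughly 2^j to 2^(j+1) observations. The sketch below reproduces them for annual data and shows the cumulative addition of details. The function names are hypothetical, the band edges are approximate, and the component values in the usage lines are placeholders rather than data from the paper.

```python
def detail_band(j, dt=1.0):
    """Approximate period band (in units of dt) captured by detail D_j."""
    return (2 ** j * dt, 2 ** (j + 1) * dt)


def smooth_cutoff(j, dt=1.0):
    """Approximate shortest period retained by the approximation S_j."""
    return 2 ** (j + 1) * dt


def approximations(smooth_top, details_desc):
    """Build S_{J-1}, S_{J-2}, ... by adding D_J, D_{J-1}, ... to S_J in turn."""
    out, current = [], list(smooth_top)
    for d in details_desc:
        current = [c + v for c, v in zip(current, d)]
        out.append(current)
    return out


for j in range(1, 5):
    lo, hi = detail_band(j)
    print(f"D{j}: {lo:.0f}-{hi:.0f} years   S{j}: > {smooth_cutoff(j):.0f} years")

# Placeholder components of length 3, only to show the bookkeeping.
s4 = [1.0, 1.0, 1.0]
d4, d3, d2 = [0.5, 0.0, -0.5], [0.1, -0.1, 0.1], [0.0, 0.2, 0.0]
s3, s2, s1 = approximations(s4, [d4, d3, d2])
```

Each approximation therefore contains everything in the smoother one plus the next band of shorter cycles.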

Table 3 Presence of missing data by country

See Table 4.

Table 4 Frequency interpretation of detail and approximation levels

In practical applications, the maximal overlap discrete wavelet transform (MODWT) is used instead of the DWT. The MODWT is a compromise between the continuous wavelet transform (CWT), with continuous variations in scale, and the DWT, which uses only a limited number of translated and dilated versions of the mother wavelet to decompose the original signal. The MODWT is highly redundant, so that the transformations at each scale are not orthogonal, but the offsetting gain is that applying the transform leaves the phase invariant (a very useful property when analysing transformations) and the transform is not subject to the limitations imposed by the dyadic expansion used by the DWT. Indeed, because of the practical limitations of the DWT, wavelet analysis is generally performed by applying the MODWT, a non-orthogonal variant of the classical discrete wavelet transform that, unlike the DWT, (i) is translation-invariant, as shifts in the signal do not change the pattern of coefficients, (ii) can be applied to datasets of length not divisible by 2^J, and (iii) returns at each scale a number of coefficients equal to the length of the original series.
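The three properties can be verified on a small example. The following is a minimal sketch, not the procedure used in the paper: it assumes a Haar filter and circular (periodic) boundary treatment, neither of which is stated in the text, and `haar_modwt` and `shift` are hypothetical helper names. The synthetic series has 139 points, the number of annual observations between 1880 and 2018.

```python
import math


def haar_modwt(x, levels):
    """Haar MODWT with circular boundaries: returns ([W_1, ..., W_J], V_J)."""
    n = len(x)
    v = [float(val) for val in x]
    wavelet_coeffs = []
    for j in range(1, levels + 1):
        lag = 2 ** (j - 1)  # filter taps are spread apart instead of downsampling
        w = [0.5 * (v[t] - v[(t - lag) % n]) for t in range(n)]
        v = [0.5 * (v[t] + v[(t - lag) % n]) for t in range(n)]
        wavelet_coeffs.append(w)
    return wavelet_coeffs, v


def shift(x, k):
    """Circularly shift a list k places to the right."""
    k %= len(x)
    return x[-k:] + x[:-k] if k else list(x)


# Synthetic series; 139 is not divisible by 2^4, which a plain DWT would require.
series = [math.sin(0.3 * t) + 0.01 * t for t in range(139)]
W, V = haar_modwt(series, levels=4)
print([len(w) for w in W], len(V))  # every level keeps all 139 coefficients

# Shifting the input by 5 periods simply shifts every coefficient series by 5.
W_shifted, V_shifted = haar_modwt(shift(series, 5), levels=4)
print(all(ws == shift(w, 5) for ws, w in zip(W_shifted, W)))
```

The price of these properties is redundancy: J levels yield (J + 1) × N coefficients rather than N.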