1 Introduction

The idea that a macroeconomic model of consumption should allow for the direct effect of government consumption in private utility is standard (Kormendi 1983; Evans and Karras 1996; Amano and Wirjanto 1997, 1998; Fève et al. 2013; Bouakez and Rebei 2007; Leeper et al. 2017; Ganelli and Tervala 2009; Fiorito and Kollintzas 2004). Here, whether private and public consumption are Edgeworth complements or substitutes becomes critical in mediating fiscal policies involving changes in government consumption. Specifically, through a positive marginal utility channel, an increase in government consumption can directly increase private consumption if the two consumption goods are Edgeworth complements. This is particularly important because, if the degree of complementarity is high enough, the increase in the marginal utility of private consumption can offset, and possibly outweigh, the standard negative wealth effect arising from financing the increase in public consumption with taxes. In contrast, Edgeworth substitutability between private and public consumption generates the opposite effect, suggesting that cuts in public consumption can induce a demand-side offset that moderates the impact of fiscal consolidation on output.

This paper provides a fresh look at the empirical relationship between private and public consumption in household utility by estimating the intratemporal elasticity of substitution for 17 European countries over the period 1970–2018. In the context of the CES utility function, and for a given intertemporal elasticity of substitution, the intratemporal elasticity of substitution (IES) between private and government consumption plays a central role in determining whether the two goods are Edgeworth complements or substitutes.Footnote 1 However, existing studies that estimate the relationship between private and government consumption in a panel data framework have done so under the “strong” assumption of cross-sectional independence across countries (e.g., Fiorito and Kollintzas 2004; Kwan 2009; Ho 2001; Dawood and Francois 2018; Brown and Wells 2008; Jalles and Karras 2021). Specifically, these studies have neglected the cross-sectional dependencies that may exist across countries either due to global shocks (e.g., oil price shocks and the ongoing COVID-19 pandemic, which calls for almost synchronized policy actions across the world) or economic spillovers (e.g., provision of public goods in neighboring countries, economic integration or fiscal policy coordination between countries (e.g., Banerjee and Carrion-i Silvestre 2017)). Neglecting these dependencies can violate crucial assumptions of the standard panel estimators employed in existing cross-country studies, which can induce biased estimates and spurious inferences (Chudik et al. 2017; Eberhardt and Teal 2020).

Additionally, the cointegration techniques employed in existing studies for both time series and panel data provide very little guidance on model specification in uncovering the IES. More precisely, with a common underlying theory to motivate the estimation, some studies have used the price of the consumption goods as the regressand (e.g., Amano and Wirjanto 1998) while other studies instead employ consumption as the regressand (e.g., Kwan 2009; Ho 2001; Dawood and Francois 2018). The choice of regressor and regressand is mixed across studies partly because the intratemporal equilibrium condition from the model equates the consumption ratio to the inverse price ratio. The researcher can therefore choose either of the two variables as the dependent variable to recover the IES, and in theory, one should expect to recover the same IES regardless of the choice of dependent variable. Unfortunately, the existing literature does not provide any evidence on whether the choice of one dependent variable over the other can affect the size of the IES through model misspecification and hence lead to incorrect inference. Indeed, Ng and Perron (1997) show that the choice of variable to put on the left-hand side as the regressand affects the precision of estimates and can lead to drastically different point estimates; hence, a careful selection of the regressand is required. Importantly, the prior literature that employs panel data and cointegration techniques fails to provide detailed post-estimation diagnostics of well-behaved residuals to ensure that models are not misspecified and hence that the estimates of the IES are valid.

In this paper, we explore the use of panel cointegration techniques and recently developed cross-sectionally augmented distributed lag (CS-DL) estimators in uncovering the IES while accounting for the aforementioned shortcomings in the existing literature. Specifically, for the empirical work, we remain agnostic by not selecting a fixed set of assumptions a priori. Consequently, in the baseline analysis we estimate the IES under: (1) different choices of estimators, (2) different model specifications in that the regressor and regressand in the cointegration relation are either chosen as the price or consumption ratio, and (3) cross-sectional dependence versus independence. Furthermore, as a means of dealing with cross-sectional dependence, we employ two strategies: one that implements the time-demeaned variable approach à la Sarafidis et al. (2009), Herzer and Morrissey (2013) and Francois and Keinsley (2019), and an alternative that augments the estimation with cross-sectional averages of all the variables as in Pesaran (2006), Holly et al. (2010), and Eberhardt and Teal (2020). Avoiding preset assumptions allows us to highlight how models in existing studies might have been misspecified. We therefore provide systematic post-estimation diagnostics in the manner of Eberhardt and Presbitero (2015) and Chudik et al. (2017) to test these sets of assumptions, evaluate model specifications, and validate the estimated IES. These features are not shared with other studies that estimate the IES (Kwan 2009; Ho 2001; Dawood and Francois 2018; Brown and Wells 2008; Amano and Wirjanto 1997).Footnote 2

The econometric methodology considered in this paper is based on a cointegrating regression model interpreted as an equilibrium condition in an optimal fiscal policy setting. The CES aggregator function is used to define effective consumption, which is a function of private and public consumption. The following results emerge from the baseline estimations: First, all estimates of the IES support a direct role of public consumption in private utility, suggesting that public consumption is utility enhancing in European countries. These results are consistent with findings in the literature (see, Leeper et al. 2017; Fiorito and Kollintzas 2004, for instance). Second, depending on the set of assumptions made prior to estimation, the estimated IES can range from as low as 0.3 to as high as 3.3, implying both gross complementarity and gross substitutability between private and public consumption for the same dataset. Specifically, in the estimation where the consumption ratio between private and public consumption is used as the regressand, we find an estimated IES between 0.297 and 0.901. These estimates consistently indicate that government and private consumption are gross complements. In contrast, in the case where the price ratio of the consumption goods is employed as the regressand, we find estimates of the IES that are consistently greater than 1, implying gross substitutability between private and public consumption. Third, post-estimation diagnostics of the error structure of the regressions, which are often ignored in this literature, reveal that estimates from the consumption ratio regressions outperform those where the price ratio is employed as the regressand, and that the most reliable estimates of the IES lie between 0.6 and 0.74. Importantly, this range, when combined with the relevant intertemporal elasticity of substitution, implies Edgeworth complementarity between private and public consumption, a finding that supports studies such as Fiorito and Kollintzas (2004) and Jalles and Karras (2021). Interestingly, this set of values of the IES is able to predict a positive response of private consumption in a standard RBC model, which coincides with results from empirical models (e.g., Leeper et al. 2017; Laumer 2020; Ilzetzki et al. 2013). Finally, we recognize that point estimates from the panel analysis can mask potential cross-country heterogeneity in the IES. Consequently, we carry out a country-by-country analysis. The results from the heterogeneous panel estimates reveal that the IES can range from as high as 1.2 in Ireland to as low as 0.3 in Italy. The variation observed in the size of the estimate is positively correlated with the share of health spending in total government expenditure but negatively associated with the share of public order expenditure in total government spending. We also find a U-shaped relationship between the size of the IES and government size.

The focus on European economies is particularly important and timely: with weak economic growth in Europe, the idea is being resurrected that European governments should pursue fiscal stimulus by increasing spending, i.e., public consumption and investment spending.Footnote 3 The findings in this paper shed light on the effectiveness of fiscal policy in these economies by considering the marginal utility channel of public consumption. That is, the result that public and private consumption are Edgeworth complements in European countries suggests that expansionary fiscal policy that targets increasing government consumption can induce Keynesian effects. This is because, with Edgeworth complementarity, an increase in government consumption raises the marginal utility of private consumption. If this increase in marginal utility is strong enough, it can outweigh the traditional wealth effect that arises from financing the increase in public consumption. Furthermore, given that current monetary policy in European economies is accommodative of fiscal policy at least in the foreseeable future, these Keynesian effects produced by the marginal utility channel of public consumption are likely to be amplified. Additionally, the findings in this paper suggest that fiscal stimulus packages that comprise large government consumption components may be effective at stimulating aggregate demand, reinforcing similar conclusions in the literature (see for instance, Boehm 2019).Footnote 4 By the same logic, the results suggest that with public and private consumption being Edgeworth complements, fiscal consolidation can be self-defeating. This finding mirrors conclusions in Bandeira et al. (2018), who show that fiscal consolidation in the form of cuts in productive public goods can be contractionary.

The rest of the paper is organized as follows: Sect. 2 presents some stylized facts on the empirical and model-based responses of private consumption following an increase in government consumption, with emphasis on the relevance of the IES. Section 3 motivates the empirical analysis with a simple theoretical model interpreted as an equilibrium condition in an optimal fiscal policy setting. Section 4 presents the econometric methodology. Section 5 discusses the baseline results and their implications. Section 6 explores the possibility of heterogeneity in the estimated values. Section 7 concludes.

2 Some stylized facts

In this section, we perform two simple but important exercises to highlight some stylized facts. First, we present the empirical response of private consumption to an increase in public consumption using a panel vector autoregression (VAR) model. We use panel data for the 17 European countries over the period 1970–2017. Second, we illustrate the relevance of the size of the IES in replicating (at least in a qualitative manner) the empirical response of private consumption to an increase in public consumption. We employ a calibrated real business cycle (RBC) model for this exercise. In both cases, we present the results via an impulse response of private consumption to an exogenous increase in public consumption.Footnote 5

2.1 What Does The Data Say?

Here, we provide a simple exercise to highlight the response of private consumption to an increase in government consumption. Uncovering this empirical response helps us appreciate the implications of the estimated intratemporal elasticities in Sect. 4, which is the main aim of the paper. To this end, we consider a reduced-form bivariate panel VAR of order p with panel-specific fixed effects represented by the following system of linear equations,

$$\begin{aligned} {\textbf{Y}}_{it} = {\textbf{u}}_{i}+\sum _{j = 1}^{p} {\textbf{Y}}_{it-j}{\textbf{A}}_{j} + {\textbf{e}}_{it} \end{aligned}$$
(1)

\( {\textbf{Y}}_{it} \) is a \((1 \times k)\) vector of variables with \(k=2\) and it comprises the stationary versions of data on government consumption and private consumption for a given year t and country i. \({\textbf{A}}_{j}\)’s are (\(k \times k\)) matrices to be estimated and \({\textbf{u}}_{i}\) and \({\textbf{e}}_{it}\) are \((1\times k)\) vectors of dependent variable-specific panel fixed-effects and idiosyncratic errors, respectively. In the manner of Holtz-Eakin et al. (1988) and Abrigo and Love (2016), we assume that individual countries in the panel share the same underlying data generating process, with the reduced form parameters \({\textbf{A}}_{j}\)’s common across countries. Systematic cross-sectional heterogeneity is modeled as panel-specific fixed effects. The estimation of the parameters of the system in (1) is conducted in a generalized method of moments framework (see, Abrigo and Love 2016, for details).Footnote 6 To recover the structural shocks from the VAR innovation in this simple bivariate setting, we adopt the recursive Cholesky decomposition identification, where government consumption is ordered first.
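To make the identification step concrete, the following is a minimal sketch of how Cholesky-orthogonalized impulse responses can be computed from a bivariate VAR written in the row-vector form of Eq. (1). It is not the GMM estimation described above: the coefficient matrix `A1` and the innovation covariance `sigma` are hypothetical values chosen purely for illustration, and `cholesky_irf` is an illustrative helper name.

```python
import numpy as np


def cholesky_irf(A_list, sigma, horizons):
    """Orthogonalized impulse responses for Y_t = sum_j Y_{t-j} A_j + e_t.

    Returns an array of shape (horizons + 1, k, k); entry [h, i, s] is the
    response of variable i at horizon h to a one-standard-deviation
    structural shock s, with shocks ordered as the variables are.
    """
    k = sigma.shape[0]
    # Eq. (1) uses row vectors; transpose to the column-vector convention.
    B = [np.asarray(A).T for A in A_list]
    # Lower-triangular factor: the first variable does not respond on impact
    # to the second variable's shock.
    P = np.linalg.cholesky(sigma)
    phi = [np.eye(k)]  # reduced-form moving-average matrices
    for h in range(1, horizons + 1):
        acc = np.zeros((k, k))
        for j, Bj in enumerate(B, start=1):
            if h - j >= 0:
                acc += Bj @ phi[h - j]
        phi.append(acc)
    return np.array([ph @ P for ph in phi])


# Hypothetical values, ordered (government consumption, private consumption).
A1 = np.array([[0.5, 0.2],
               [0.0, 0.4]])
sigma = np.array([[1.0, 0.3],
                  [0.3, 1.0]])
irf = cholesky_irf([A1], sigma, horizons=8)
private_response = irf[:, 1, 0]  # private consumption to a government shock
```

Ordering government consumption first means that, on impact, government consumption does not react to the private consumption shock, which is why `irf[0, 0, 1]` is zero in this sketch.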

Fig. 1 Response of Private Consumption to an increase in Government Consumption. Notes: The shaded grey area is the 68% confidence interval. The lag order p is set to two according to the modified Akaike information criterion and the Bayesian information criterion. The first four lags of the endogenous variables in the system are used as instruments

The response of private consumption to an increase in government consumption is depicted as an impulse response function in Fig. 1. It is evident from Fig. 1 that an exogenous increase in government consumption increases private consumption, and this positive response is statistically significant. Although the results in Fig. 1 provide only a preliminary picture of the relationship, the positive response of private consumption is broadly in line with studies such as Ilzetzki et al. (2013), Jha et al. (2014), Kilponen et al. (2019), and Laumer (2020), who find a similar response of private consumption following an increase in government consumption.

2.2 How Relevant Is The IES?

The findings in the previous section and evidence in the empirical literature reinforce the idea that a positive consumption multiplier is not uncommon.Footnote 7 However, this increase in private consumption following an increase in government consumption is at odds with predictions from neoclassical macroeconomic theory. Specifically, in standard RBC models, an increase in public consumption induces a negative wealth effect from financing the rise in government consumption with taxes, which reduces the household’s permanent income (Baxter and King 1993). This drop in income forces households to increase their labor supply; however, this increase in labor supply is not strong enough to offset the negative wealth effect, and consumption falls in equilibrium. Additionally, although standard New Keynesian models dampen the negative wealth effect, they do not generate positive consumption multipliers.Footnote 8 To reconcile model-based predictions with empirical consumption multipliers, the New Keynesian framework, which features sticky prices, is extended to include non-Ricardian households (Galí et al. 2007).

The introduction of non-Ricardian households breaks the standard Ricardian equivalence present in the baseline RBC model, since this fraction of households cannot borrow or save and hence consumes all of its current income. Consequently, because wages rise after an increase in government consumption, these non-optimizing households raise their consumption. If the fraction of non-Ricardian households is large enough, this can generate a net positive response of consumption following the increase in government consumption (see, Leeper et al. 2017).Footnote 9

An alternative, and perhaps more intuitive, transmission mechanism proposed in the model-based literature, which does not require a rule-of-thumb household assumption, is the complementarity between private and public consumption.Footnote 10 Here, when government consumption is a complement to private consumption in an Edgeworth–Pareto sense, an increase in government consumption raises the marginal utility of private consumption and provides additional motives for households to work more. When the degree of complementarity is sufficiently high, this positive marginal utility channel can offset the standard negative wealth effect and induce a positive consumption response. In a model that assumes a CES aggregator form for effective consumption, the intratemporal elasticity of substitution becomes a central parameter in driving the size and even the sign of the consumption response.

Fig. 2 Model-Based Response of Private Consumption to an increase in Government Consumption. Notes: Model-based response of private consumption following an increase in government consumption in a standard RBC model with utility-enhancing public consumption and capital adjustment costs. Consumer utility is the standard CRRA with \(U = (C^{e})^{1-1/\gamma }/(1-1/\gamma )\), where the intertemporal elasticity is given as \(1/\gamma = 0.8\) and effective consumption \((C^{e})\) is defined as \(C^{e}_{t} = [\lambda \varepsilon _{t}C_{t}^{1-(1/\theta )}+(1-\lambda )\nu _{t} G_{t}^{1-(1/\theta )}]^{1/(1-(1/\theta ))}\). The parameter \(\theta \) governs the IES. \(sgn[U_{C,G}] = sgn[1/\gamma - \theta ]\) governs the Edgeworth complementarity (substitutability) between private and government consumption, where \(U_{C,G} >0 (<0)\) represents complementarity (substitutability). The case where \(U_{C,G} = 0\) is the standard neoclassical case where only the wealth effect operates. With standard calibration of other model parameters, we still find a positive response of private consumption (on impact) to an increase in government consumption for a value of \(\theta \) up to approximately 0.74

To elucidate the relevance of the intratemporal elasticity of substitution, we embed utility-enhancing government consumption in a basic RBC model. RBC models are the bedrock of most general equilibrium models, and in their pure form they produce a negative consumption effect in response to an increase in government consumption. Hence, using an RBC model provides a good starting point to highlight the relevance of this simple modification. To this end, the utility function is specified as a standard CRRA utility function with the intertemporal elasticity of substitution defined as \(1/\gamma \) and effective consumption, \(C^{e}\), defined as \( C^{e} =[ \lambda C_{t}^{1-(1/\theta )}+(1-\lambda ) G_{t}^{1-(1/\theta )}]^{1/(1-(1/\theta ))}\), where \(\theta \) is the intratemporal elasticity of substitution, \(\lambda \) is the weight of private consumption in effective consumption and \(C_{t}\) and \(G_{t}\) are private and government consumption, respectively.Footnote 11 We fix the intertemporal elasticity of substitution associated with the utility function at 0.8 as in Havránek (2015) for this exercise. Havránek (2015) reports that 33 studies published in the top five general interest journals find an intertemporal elasticity of substitution of 0.9 on average. However, we employ the more conservative value of 0.8, as Havránek (2015) proceeds to show that calibration above 0.8 is inconsistent with the empirical literature. We present the response of private consumption following an exogenous increase in government consumption in Fig. 2 via impulse response functions.

As can be seen from Fig. 2, for the given intertemporal elasticity of substitution, the response of private consumption is inversely related to the size of the IES. That is, smaller values of the IES, which ensure that private and government consumption are Edgeworth complements, dampen the negative response of private consumption in the standard RBC model. If the degree of complementarity is high enough (i.e., low \(\theta \)), it can completely offset, and even outweigh, the standard negative wealth effect, inducing a positive response of private consumption. In contrast, when private and public consumption are Edgeworth substitutes, the negative wealth effect is reinforced. This is evident in Fig. 2, as the impulse response functions associated with a larger IES generate a larger negative response of private consumption compared to the standard RBC model. The results from this exercise indicate that accurately estimating the IES from data, with the cointegration techniques this paper utilizes, is important because the IES has a direct impact on how private consumption responds to changes in public consumption.
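The sign rule behind this discussion can be checked numerically. The sketch below evaluates the cross-partial derivative of utility with respect to private and government consumption by finite differences, for a CES aggregator nested in a CRRA function. It is an illustration only, not the RBC model itself: the utility curvature is parametrized directly by the intertemporal elasticity of 0.8 used in the text, the preference shocks are set to their unit means, and the weight `lam = 0.8` and the evaluation point `(c, g) = (1.0, 0.3)` are hypothetical choices.

```python
def effective_consumption(c, g, theta, lam=0.8):
    """CES aggregate of private (c) and government (g) consumption."""
    rho = 1.0 - 1.0 / theta  # undefined at theta = 1 (Cobb-Douglas limit)
    return (lam * c**rho + (1.0 - lam) * g**rho) ** (1.0 / rho)


def utility(c, g, theta, eis=0.8, lam=0.8):
    """CRRA utility whose curvature is the inverse of the intertemporal
    elasticity of substitution (eis); eis = 1 (log utility) is excluded."""
    curvature = 1.0 / eis
    ce = effective_consumption(c, g, theta, lam)
    return ce ** (1.0 - curvature) / (1.0 - curvature)


def cross_partial(c, g, theta, eis=0.8, lam=0.8, h=1e-4):
    """Central finite-difference approximation of d2U / (dC dG)."""
    def f(x, y):
        return utility(x, y, theta, eis, lam)
    return (f(c + h, g + h) - f(c + h, g - h)
            - f(c - h, g + h) + f(c - h, g - h)) / (4.0 * h * h)


# Hypothetical evaluation point; only the sign of each value matters.
u_cg_low = cross_partial(1.0, 0.3, theta=0.6)   # theta below 0.8
u_cg_mid = cross_partial(1.0, 0.3, theta=0.9)   # between 0.8 and 1
u_cg_high = cross_partial(1.0, 0.3, theta=1.5)  # theta above 1
```

Under these assumptions the cross-partial is positive only when \(\theta \) lies below the intertemporal elasticity. A value such as \(\theta = 0.9\) therefore implies gross complementarity (\(\theta < 1\)) but not Edgeworth complementarity, which is why the two notions are kept separate in the text.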

3 A simple theory to guide the empirical model

In this section, we lend a structural interpretation to the empirical estimation that follows in Sect. 4. We borrow heavily from Amano and Wirjanto (1998), Kwan (2009), and Dawood and Francois (2018) and assume a representative agent (i.e., social planner) who gains utility from two goods, private and public. The agent’s expected lifetime utility function is given by Eq. (2) and is subject to stationary preference shocks:

$$\begin{aligned} U_{t}=E_{0}\sum \limits _{t=0}^{\infty } \beta ^{t}u(C^{e}_{t}, \varepsilon _{t}, \nu _{t}), \end{aligned}$$
(2)

where u(.) takes the constant relative risk aversion (CRRA) form \(u(C^{e}) = \frac{(C^{e})^{1-\frac{1}{\gamma }}}{1-\frac{1}{\gamma }}\), with \(1/\gamma \) representing the intertemporal elasticity of substitution. Effective consumption \(C^{e}\) is a constant elasticity of substitution (CES) aggregate of private and public consumption:

$$\begin{aligned} C^{e}_{t} = [\lambda \varepsilon _{t}C_{t}^{1-(1/\theta )}+(1-\lambda )\nu _{t} G_{t}^{1-(1/\theta )}]^{1/(1-(1/\theta ))}, \end{aligned}$$
(3)

where the random preference shocks \((\varepsilon _{t}, \nu _{t})\) are strictly stationary with unit means. These stationarity assumptions imply that preferences are stable in the long run. The preference parameters \(\lambda \in [0,1]\) and \(\theta > 0\) represent the relative weight assigned to private goods and the intratemporal elasticity of substitution (IES), respectively. The latter restriction ensures that the standard assumption of convexity of preferences is preserved; a negative value of \(\theta \) therefore violates this preference assumption. An intratemporal elasticity of substitution that is greater (less) than one implies gross substitutability (complementarity) between private and public consumption. As \(\theta \) approaches zero, the two goods become perfect complements. Accordingly, estimated values of \(\theta \) less than zero are theoretically implausible as they violate standard properties of the consumer utility function (Ogaki et al. 1996). The agent maximizes her utility subject to the budget constraint \(P^{g}_{t}G_{t} + P_t^{c}C_{t} = I_{t}\), where \(I_{t}\) is income.

With the assumption that the agent’s utility function is time-separable, the optimal consumption bundle satisfies an equality condition between the marginal rate of substitution (MRS) and the relevant relative price.Footnote 12 This yields the intra-temporal Euler equation of private versus public consumption for the social planner. Hence, we obtain the condition:

$$\begin{aligned} \frac{\partial U_{t}/\partial G_{t}}{\partial U_{t}/\partial C_{t}} \equiv \frac{\nu _{t}(1-\lambda ) C_{t}^{1/\theta }}{ \varepsilon _{t}\lambda G_{t}^{1/\theta }} = \frac{P^{g}_{t}}{P^{c}_{t}}. \end{aligned}$$
(4)

Taking logs in Eq. (4), we obtain:

$$\begin{aligned} \ln \left( \frac{C_{t}}{G_{t}}\right) = -{\theta }\ln \left( \frac{1-\lambda }{\lambda }\right) +\theta \ln \left( \frac{P^{g}_{t}}{P^{c}_{t}}\right) - \theta \ln \left( \frac{\nu _{t}}{\varepsilon _{t}}\right) . \end{aligned}$$
(5)

As mentioned earlier, stability of preferences implies that the residual term \(-\theta \ln (\nu _{t}/\varepsilon _{t})\) is stationary, and thus that Eq. (5) is a cointegrating regression provided that the log price ratio \(\ln (P_{t}^{g}/P_{t}^{c} )\) and the log consumption ratio \(\ln (C_{t} /G_{t})\) are both I(1) processes. The combination of stable preferences and the optimality condition in Eq. (5) therefore imposes a cointegration restriction on the co-movements of the log-consumption ratio and log-price ratio series.
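For completeness, the intermediate step between Eqs. (4) and (5) can be written out explicitly. This is only a restatement in the same notation and introduces no additional assumptions. Taking logs of both sides of Eq. (4) term by term gives

$$\begin{aligned} \ln \left( \frac{\nu _{t}}{\varepsilon _{t}}\right) + \ln \left( \frac{1-\lambda }{\lambda }\right) + \frac{1}{\theta }\ln \left( \frac{C_{t}}{G_{t}}\right) = \ln \left( \frac{P^{g}_{t}}{P^{c}_{t}}\right) . \end{aligned}$$

Multiplying through by \(\theta \) and isolating the log consumption ratio on the left-hand side yields Eq. (5).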

Equation (5) provides a structural equation that can be estimated consistently with cointegration techniques. From an economic perspective, this structural equation allows for a neat interpretation of \(\theta \), where gross complementarity between private and government consumption corresponds to estimates of \(\theta \) between zero and one, while estimates of \(\theta \) greater than one imply gross substitutability. It is important to point out that the estimation equation does not involve the intertemporal elasticity of substitution. This allows us to focus on uncovering the intratemporal elasticity of substitution without having to make any stringent assumptions on the intertemporal elasticity of substitution.

It is worth noting that the theory does not provide definitive guidance on which variable, the consumption ratio or the price ratio, to employ as the regressand. While our preferred specification selects the consumption ratio as the regressand, we also estimate the IES using the price ratio as the regressand. We go on to show in Sect. 5 that selecting the consumption ratio as the regressand, as in Eq. (5), is indeed the preferred specification.
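The role of the regressand can be illustrated with a small simulation of Eq. (5). The sketch below is hypothetical throughout: it generates a random-walk log price ratio and a stationary preference term, builds the log consumption ratio with a true \(\theta \) of 0.7, and then recovers \(\theta \) by simple OLS in both directions. The sample length, the intercept, and the shock variances are illustrative assumptions, and plain OLS stands in for the cointegration estimators used in the paper.

```python
import numpy as np


def ols_slope(y, x):
    """Slope from a bivariate OLS regression of y on x with an intercept."""
    xd = x - x.mean()
    yd = y - y.mean()
    return float(xd @ yd / (xd @ xd))


rng = np.random.default_rng(0)
T = 2000            # hypothetical, much longer than the annual sample
theta_true = 0.7    # hypothetical true intratemporal elasticity

log_price_ratio = np.cumsum(rng.normal(size=T))   # I(1) regressor
pref_term = rng.normal(size=T)                    # stationary residual
log_cons_ratio = 0.5 + theta_true * log_price_ratio + pref_term

# Consumption ratio as regressand: the slope is theta directly.
theta_direct = ols_slope(log_cons_ratio, log_price_ratio)
# Price ratio as regressand: the slope estimates 1/theta, so invert it.
theta_reverse = 1.0 / ols_slope(log_price_ratio, log_cons_ratio)
```

In a bivariate OLS regression the two estimates are linked mechanically: `theta_direct` equals `theta_reverse` multiplied by the regression \(R^{2}\), so the reverse regression can never imply a smaller \(\theta \) than the direct one. With a long simulated sample the gap is negligible, but in short samples with noisier residuals it need not be.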

4 Econometric evidence

Following the results from the previous section, the basic equation of interest for our analysis of uncovering the IES between private and public consumption is given as:

$$\begin{aligned} \ln \left( \frac{C_{it}}{G_{it}}\right) = \theta \ln \left( \frac{P^{g}_{it}}{P^{c}_{it}}\right) + \nu _{it}. \end{aligned}$$
(6)

Additionally, we specify the \(\nu _{it}\) to follow the process,

$$\begin{aligned} \nu _{it} = \alpha _{i} + \zeta '_{i}{\textbf{f}}_{t} + u_{it}, \end{aligned}$$
(7)

where \(\alpha _{i}\) captures country-specific effects. Additionally, unlike previous studies that have estimated the IES, we explicitly allow for a set of unobserved common factors \({\textbf{f}}_{t}\) with country-specific ‘factor loadings’ \(\zeta '_{i}\) to account for the unobserved factors and economic spillovers that may drive the relationship under consideration. The loadings indicate the impact of the factors on unit i, and \(u_{it}\) is a pure idiosyncratic error. The common factors by design not only drive the consumption ratio but also affect the price ratio. The latter is in line with arguments in Mundlak et al. (2008), Holly et al. (2010) and Eberhardt and Teal (2013). This generates a different type of endogeneity that is not easily remedied through instrumental variable estimations. Moreover, these common factors can encompass weak factors, strong factors or both. The weak factors include local spillover effects that arise from shared cultural heritage, geographic proximity, economic and social interactions and integration (Banerjee and Carrion-i Silvestre 2017). On the other hand, strong factors capture more global influences such as global shocks or even synchronized changes in consumer preferences (e.g., the financial crisis in 2008, the 1970s oil crisis, and the ongoing COVID-19 pandemic, which calls for almost synchronized policy actions across the world). Together, these common factors, weak and strong, should not be dismissed as omitted variables but instead treated as a set of latent drivers of these macroeconomic variables. In the presence of these common factors, one cannot correctly identify the parameter of interest \(\theta \) unless the unobservable factors in the error term \(\nu _{it}\) are accounted for.Footnote 13

Notice that if both \(\ln (C_{it}/G_{it})\) and \(\ln (P^{g}_{it}/P^{c}_{it})\) are difference-stationary I(1) processes and \(\nu _{it}\) is a stationary I(0) process, then the two variables are cointegrated. We provide formal evidence on these unit root and cointegration properties in Sect. 4.1. Here, the parameter \(\theta \) can be estimated consistently from Eq. (6) even though there may be measurement errors or stationary omitted variables. That is, the slope parameter can be estimated consistently without the assumption that the regressors are econometrically exogenous. This is possible because cointegration estimators possess super-consistency properties (Pedroni 2019).

Although Eq. (6) is the basic cointegrating equation employed by existing studies and is directly tied to the theoretical setup in Sect. 3, a plethora of macro-econometric specifications arise to control for potential econometric issues. Specifically, an applied econometrician can either assume parameter homogeneity, in which case the key parameter \(\theta \) is assumed to be the same across cross-sectional units (i.e., \(\theta _{i} = \theta \) for all countries), or parameter heterogeneity across countries in the panel, in which case \(\theta \) varies for each country. Here, we follow the literature and pool the data, which can also lead to efficiency gains (Baltagi and Griffin 1997; Baltagi et al. 2008; Hsiao 2007). Importantly, we present homogeneity tests to formally confirm this assumption before proceeding with the estimation.

Moreover, the issue of cross-sectional dependence versus independence becomes an important assumption in estimating the IES with panel data. Furthermore, even if one assumes cross-sectional dependence, there are a variety of ways to deal with it depending on whether cross-sectional dependence in the error structure is weak or strong. For instance, does one augment the regression with cross-sectional means as in the Pesaran (2006) approach, or should one address it by time demeaning the data as in Francois and Keinsley (2019) and Herzer and Morrissey (2013)? The choice of how cross-sectional dependence is treated can have a non-trivial impact on the accuracy of the estimated IES (see for example Sarafidis and Wansbeek 2012; De Hoyos and Sarafidis 2006, for a discussion on the treatment of cross-sectional dependence).
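The difference between these treatments can be sketched on simulated data. The example below is a stylized illustration under strong assumptions, not a reproduction of the estimators used in this paper: the data are stationary rather than I(1), there is a single common factor, the factor loadings are drawn independently for the regressor and the error term, and all names and parameter values are hypothetical. It compares a pooled fixed-effects slope that ignores the factor, a two-way (unit and time) demeaned slope, and a pooled slope that partials out cross-sectional averages in the spirit of Pesaran (2006).

```python
import numpy as np

rng = np.random.default_rng(1)
N, T, theta_true = 17, 200, 0.7           # hypothetical panel dimensions

f = rng.normal(size=T)                    # unobserved common factor
g = rng.uniform(0.5, 1.5, size=(N, 1))    # factor loadings in the regressor
z = rng.uniform(0.5, 1.5, size=(N, 1))    # factor loadings in the error term
alpha = rng.normal(size=(N, 1))           # country fixed effects
x = g * f + rng.normal(size=(N, T))
y = alpha + theta_true * x + z * f + rng.normal(size=(N, T))


def pooled_slope(yt, xt):
    """Pooled OLS slope on already-transformed (N, T) arrays."""
    return float((xt * yt).sum() / (xt * xt).sum())


def within(a):
    """Remove country means only (fixed effects)."""
    return a - a.mean(axis=1, keepdims=True)


def two_way(a):
    """Remove country means and period means (time demeaning)."""
    return (a - a.mean(axis=1, keepdims=True)
            - a.mean(axis=0, keepdims=True) + a.mean())


def cce_pooled(y, x):
    """Partial out a constant and the cross-sectional averages of y and x
    from every country's series, then pool the slope."""
    n_periods = y.shape[1]
    H = np.column_stack([np.ones(n_periods), y.mean(axis=0), x.mean(axis=0)])
    M = np.eye(n_periods) - H @ np.linalg.pinv(H)  # symmetric annihilator
    return pooled_slope(y @ M, x @ M)


theta_fe = pooled_slope(within(y), within(x))
theta_tw = pooled_slope(two_way(y), two_way(x))
theta_cce = cce_pooled(y, x)
```

In this design the fixed-effects slope absorbs part of the factor and overstates \(\theta \), while both corrections move the estimate back toward the true value. Time demeaning performs well here only because the two sets of loadings are uncorrelated; with correlated loadings it would leave more of the factor in the residual than the cross-sectional-average approach.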

Furthermore, the choice of estimator is an important decision for the econometrician. In the context of nonstationary panels, there is a plethora of estimators to choose from, e.g., panel dynamic OLS (DOLS) and fully-modified OLS (FMOLS), among others. These estimators are designed to handle nonstationary data and are able to circumvent endogeneity issues arising from certain forms of simultaneity, omitted variables, measurement error, and reverse causality. This is because these estimators possess a superconsistency property under cointegration (see, Pedroni 2019). Nonetheless, each of them has its strengths and weaknesses. For instance, Kao and Chiang (2000) study the asymptotic distributions of the OLS, FMOLS, and DOLS estimators and find that DOLS outperforms both OLS and FMOLS. These estimators do not traditionally address cross-sectional dependence. However, as we show in Sect. 5, they can be easily modified to address the issue of cross-sectional dependence.

In summary, with the exception of pooling the data, we do not make any additional assumptions on cross-sectional dependence, the choice of estimator, or the regressand prior to estimation. Specifically, we remain agnostic and present an array of models that consider these assumptions individually or jointly. In the baseline specification in Eq. (6), we pool the data, hence constraining the parameter \(\theta \) to be common for all countries in the panel. We formally discuss and test the assumption of homogeneity in Sect. 4.1.3. This assumption may seem stringent at face value; nonetheless, we discuss how the alternative of allowing heterogeneity in the estimated parameter can lead to severe misinterpretation or inaccurate inferences due to theoretical restrictions on \(\theta \). We then go on to estimate the IES under several sets of model assumptions and conduct a battery of post-estimation diagnostics to compare the validity of the estimated IES across models.

4.1 Data and pre-testing

We employ annual data for 1970 to 2018 from the World Development Indicators (World Bank, 2019) for 17 European economies. These countries were selected primarily due to data availability. Additionally, the start and end dates are driven by missing observations for the years before 1970 and after 2018. The consumption ratio is derived by dividing household final consumption expenditure by general government final consumption expenditure, both in 2010 constant dollars. The corresponding prices are computed as the implicit price deflators, which are constructed by dividing the nominal private and government consumption series by their respective constant price series. Figure 3 depicts the two series for all 17 European countries in the sample. The two series generally display strong persistence and co-movement. It is therefore intuitive, at face value, to assume that the two series are individually I(1) and potentially cointegrated. However, for countries such as Luxembourg, Spain, and Switzerland, we observe a divergence in the movement of the two series. Moreover, the strength of co-movement varies across countries. This naturally prompts the need to go beyond pooled panel analysis and explore country-specific analysis. We now turn to formally testing these observations.

Fig. 3 Private and government consumption ratio and relative price. The horizontal axis is the time horizon (in years). The left and right y-axes represent the log consumption and relative price ratio, respectively

Recall that preferences are stable, implying that the residual term \(\nu _{it}\) is stationary, and that our basic equation for estimation is a cointegrating regression if the regressor and regressand are both integrated processes. In this sense, the combination of stable preferences and the optimality conditions derived in Eq. (6) imposes a cointegration restriction on the co-movement of the log consumption and price ratio series. Since the assumption of stationarity is placed on the error term in the empirical model, the natural litmus test for a cointegrating regression in our case is a residual-based cointegration test. To this end, we formally test (i) whether the log consumption ratio and log price ratio are I(1) processes, and (ii) whether the error term \(\nu _{it}\) is stationary, so that there exists a cointegration relation in Eq. (6). In what follows, we conduct a thorough pre-testing analysis to confirm a cointegration relation in Eq. (6).

4.1.1 Unit root tests

It has been widely shown that most unit root tests for time series have low power and therefore fail to reject the null of a unit root too often. The extension of unit root tests to a panel framework improves the power of unit root testing by incorporating information contained in the cross-sectional dimension. In this study, we pool the data for the 17 countries to perform four first-generation panel unit root tests: Levin et al. (2002) (LLC), Breitung (2000), Im et al. (2003) (IPS), and Maddala and Wu (1999) (ADF). The first two tests (LLC and Breitung) assume a common autoregressive coefficient across all cross-sections, while the final two (IPS and ADF) allow for more flexibility by permitting the autoregressive coefficient to vary across cross-sections. These tests, however, assume cross-sectional independence, which could lead to significant size distortions in the presence of neglected cross-sectional dependence (Baltagi and Pesaran 2007). Hence, in addition to these first-generation tests, we also consider the cross-sectionally augmented IPS (CIPS) test proposed by Pesaran (2007). The CIPS test filters out cross-sectional dependence by augmenting the ADF regression with the cross-section averages of lagged levels and first-differences of the individual series (see, for example, Herzer and Grimm 2012; Baltagi and Pesaran 2007, for a discussion of second-generation unit root tests).
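For reference, the cross-sectionally augmented Dickey-Fuller regression underlying the CIPS statistic can be sketched as follows, shown here without additional lag augmentation. The symbols \(y_{it}\), \(a_i\), \(b_i\), \(c_i\), \(d_i\), and \(t_i\) are our own notation for this illustration:

$$\begin{aligned} \Delta y_{it} = a_i + b_i y_{i,t-1} + c_i {\bar{y}}_{t-1} + d_i \Delta {\bar{y}}_{t} + e_{it}, \qquad \text {CIPS} = \frac{1}{N}\sum ^{N}_{i=1} t_i, \end{aligned}$$

where \({\bar{y}}_{t} = N^{-1}\sum ^{N}_{i=1} y_{it}\) and \(t_i\) is the t-ratio on \(b_i\) in the regression for country i. The unit root null corresponds to \(b_i = 0\) for all i.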

Table 1 Unit Root Tests

Table 1 reports the formal panel unit root test results. It is evident from the table that all five tests clearly fail to reject the unit root null hypothesis for the level series. However, the unit root null hypothesis is strongly rejected when we employ the first-differenced series. The results therefore confirm that the log price and consumption ratio series are non-stationary I(1) processes.

4.2 Choosing the regressand

In the context of estimating the IES, existing studies often remain silent on the choice of regressand. Naturally, the IES can directly be recovered from the equilibrium relationship in Eq. (6) by setting the consumption ratio as the regressand (Dawood and Francois 2018; Kwan 2009). However, because the equilibrium condition offers no guide on which variable to set as the regressand, some studies such as Amano and Wirjanto (1998) instead use the price ratio as the regressand. Hence, these studies estimate the inverse of the IES and then recover the IES. In theory, irrespective of the choice of regressand, the econometrician should uncover the same IES. However, using an empirical example of bivariate models, Ng and Perron (1997) show that least-squares estimates can have very poor finite sample properties when normalized, with regards to choice of regressand, in one direction but are well behaved when normalized in the other. This occurs when one of the I(1) variables is a weak random walk or is nearly stationary. In what follows, we provide discussions from the Ng and Perron rule, Granger causality tests from a panel vector error correction model, and pairwise Granger causality tests to provide some insight into selecting the regressand.
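A minimal sketch of why the normalization can matter even in the simplest static case. It uses plain least-squares algebra, not Ng and Perron's analysis: in a bivariate OLS regression, the product of the two slopes equals \(R^2\), so the elasticity recovered by inverting the reverse regression is weakly larger in absolute value than the direct estimate. The data and all names below are hypothetical.

```python
def ols_slope(x, y):
    """Slope from an OLS regression of y on x (with an intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

# Hypothetical log price ratio (p) and log consumption ratio (c)
p = [0.00, 0.10, 0.20, 0.30, 0.40, 0.50]
c = [0.02, 0.03, 0.13, 0.12, 0.22, 0.23]

theta_direct = ols_slope(p, c)    # consumption ratio as regressand
beta = ols_slope(c, p)            # price ratio as regressand: estimates 1/theta
theta_inverse = 1.0 / beta        # elasticity recovered by inversion
r_squared = theta_direct * beta   # product of the two slopes equals R^2

print(theta_direct, theta_inverse, r_squared)
```

The two recovered elasticities coincide only when the fit is perfect; the gap widens as \(R^2\) falls.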

4.2.1 The Ng and Perron rule

As proposed by Ng and Perron (1997) and applied in Kwan (2009) and Dawood and Francois (2018), it is more desirable to use the more integrated series as the regressor (explanatory variable) and the less integrated series as the regressand (dependent variable). Most of the p-values obtained for the level series suggest that the log consumption ratio has a stronger random walk component than the log price ratio (Table 1). The exception is the CIPS test, under which the log consumption ratio is less integrated than the log price ratio; all the other tests suggest that the log price ratio is less integrated than the log consumption ratio. The unit root tests therefore give mixed guidance. In particular, while the CIPS test suggests estimating Eq. (6) with the consumption ratio as the regressand, all the other tests suggest employing the price ratio as the regressand. However, since the CIPS test controls for cross-sectional dependence, we consider it superior to the other tests. Thus, the Ng and Perron rule suggests using the consumption ratio as the regressand based on the CIPS test results.

4.2.2 Granger causality

We now turn our attention to utilizing Granger causality tests in guiding the direction of causality. We present two exercises: (1) pairwise Granger causality tests and (2) panel Granger causality based on a panel vector error correction model (VECM). We start with the standard pairwise Granger causality tests and report test results based on Dumitrescu and Hurlin (2012, henceforth DH) which controls for cross-sectional dependence and allows for heterogeneity of this causal relationship. Thus, for a given pair of economic variables, X and Y, the null hypothesis of the DH-test is that X does not homogeneously cause Y. We also report results from the standard Granger causality tests.

Table 2 Pairwise Granger-Causality Tests

Table 2 presents the results from the pairwise Granger causality tests. Columns 1 and 2 report the results from the standard Granger causality and DH tests, respectively. With the standard test, we find evidence of uni-directional causality going from the consumption ratio (\(\ln C^R_t\)) to the price ratio (\(\ln P^R_t\)). That is, the test rejects the null hypothesis that the consumption ratio does not Granger cause the price ratio at the 1% significance level, while failing to reject the null that the price ratio does not Granger cause the consumption ratio. Turning to the DH test, we reject the null hypothesis in both cases. Specifically, we reject the null that the price ratio does not homogeneously cause the consumption ratio at the 1% significance level, and vice versa. This provides evidence of a bi-directional causal relationship.

Table 3 Panel Granger Causality based on Panel VECM

Table 3 presents the results from the Granger causality tests based on a panel VECM, which tests for both short-run and long-run causality. Guided by the Schwarz information criterion, the lag structure is set to two. The results in Table 3 show evidence of uni-directional short-run causality running from consumption to price. In the long run, however, there is evidence of a bi-directional causal relationship between consumption and prices.

4.2.3 Discussion

The results from the Ng and Perron (1997) rule and the Granger causality tests provide a mixed conclusion on which variable to select as the regressand. While pre-estimation tests are useful in designing the empirical study, post-estimation diagnostics help increase the validity and reliability of the estimates from the empirical design. Ng and Perron (1997) document that the choice of regressand has implications for residual-based unit root tests for cointegration. Consequently, rather than keeping to one specification, we present empirical results from both scenarios and explicitly present post-estimation diagnostic tests to validate estimates of the IES. More precisely, we report detailed post-estimation diagnostic tests on the desirable features of well-behaved residuals along the lines of Eberhardt and Presbitero (2015). For completeness, we present cointegration tests with normalization in both directions.Footnote 14

4.2.4 Cointegration tests

In this section, we present evidence of a cointegration relation in the main equation of interest, Eq. (6). The primary goal here is to validate the assumption that preferences are stable. This requires testing the stationarity property of the residuals. To this end, we test for the presence of cointegration in Eq. (6) using residual-based cointegration tests. Specifically, we employ four standard panel and group test statistics suggested by Pedroni (1999). The standard Pedroni tests, however, do not account for potential cross-sectional dependence. In the presence of cross-sectional dependence that may arise from multiple unobserved common factors, an assumption of cross-sectional independence can lead to biased inference (Herzer and Morrissey 2013; Baltagi and Pesaran 2007). In order to account for cross-sectional dependence, we utilize the version of the standard Pedroni tests that deals with cross-sectional dependence in the manner of Neal (2014). The strategy involves time demeaning of the data for each cross-sectional unit and variable (see Neal 2014 for the theory and implementation details). For completeness, we also report test results in which we assume cross-sectional independence.

In addition to the residual-based tests, we report cointegration tests based on Westerlund (2007). The cointegration tests by Westerlund are based on structural rather than residual dynamics and therefore do not impose any common factor restriction. Importantly, Westerlund (2007) compares the small-sample performance of the tests with that of the popular residual-based tests by Pedroni (1999) and finds that they have good size accuracy and are more powerful than the residual-based tests.Footnote 15 The tests are designed to test the null by inferring whether the error correction term in a conditional error correction model is equal to zero. If the null hypothesis of no error correction is rejected, then the null hypothesis of no cointegration is also rejected. Each test is able to accommodate individual-specific short-run dynamics, including serially correlated error terms, non-strictly exogenous regressors, individual-specific intercepts, and individual-specific slope parameters. We utilize bootstrap tests to account for cross-sectional dependence.

Table 4 presents results for two cases: one where the consumption ratio is used as the regressand and another where the price ratio is used as the left-hand-side variable. Test results generally reject the null hypothesis of no cointegration at conventional levels of statistical significance. In particular, with the exception of the Group normalized statistic and the Pedroni Panel PP statistic, seven of the nine cointegration tests reject the null of no cointegration in the case where the consumption ratio is selected as the regressand. Similarly, six out of the nine tests in Table 4 reject the null when the price ratio is employed as the regressand in Eq. (6). The non-rejection of the null hypothesis by some of the tests reinforces the need to estimate the model in Eq. (6) for the two choices of regressand.

Table 4 Panel Cointegration Tests

5 Benchmark estimate of IES

In this section, we present the results from the panel cointegration regression. All regression specifications account for country-specific fixed effects. To explicitly highlight how (1) the choice of estimator, (2) assumptions on and treatment of cross-sectional (in)dependence, and (3) the choice of regressor and regressand in Eq. (6) affect the size of the estimated IES, we report an array of estimators that consider the aforementioned scenarios. More specifically, for the choice of estimators we utilize the pooled versions of the Dynamic OLS (DOLS) estimator by Kao and Chiang (2000) and Mark and Sul (2003) and the Fully-Modified OLS (FMOLS) estimator by Pedroni (2001a). Without any modification, these estimators assume cross-sectional independence in the data. As previously discussed, in the presence of cross-sectional dependence, this can lead to inaccurate estimates. To account for potential cross-sectional dependence in the data, we employ two approaches. First, we use the cross-sectional demeaning approach applied in Herzer and Morrissey (2013) and Francois and Keinsley (2019). This approach subtracts from each variable in the estimation equation its cross-sectional average in each period. Specifically, we replace each variable \(X_{it}\) in Eq. (6) by the transform \({\widetilde{X}}_{it}\) where,

$$\begin{aligned} {\widetilde{X}}_{it} = X_{it} - {\bar{X}}_{t}, \quad \text {and} \quad {\bar{X}}_{t} = \frac{1}{N}\sum ^{N}_{i=1} X_{it} \end{aligned}$$
(8)

The estimators associated with the cross-sectionally demeaned (CD) variables are DOLS-CD and FMOLS-CD. It is important to note that while the cross-sectional demeaning approach has the advantage of avoiding over-parameterization as it preserves the number of regressors in the specification, it only addresses potential weak cross-sectional dependence. If the type of cross-sectional dependence is strong or more complex, this strategy of accounting for cross-sectional dependence would not be sufficient. Specifically, as discussed in De Hoyos and Sarafidis (2006), while cross-sectional (time-) demeaning removes the mean impact of the factors, in the polar case where the variance of the coefficient on the factor loadings \(\lambda _{i}\) in Eq. (7) grows large, time demeaning will be less effective. This is because even if the mean impact of the factors has been removed, there will still be a considerable amount of cross-sectional dependence left out in the disturbance (see, De Hoyos and Sarafidis 2006; Sarafidis and Wansbeek 2012, for detailed discussion).
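The demeaning transform in Eq. (8) can be sketched as follows for a balanced panel. The function name, the dictionary layout, and the numbers are hypothetical and only illustrate the arithmetic.

```python
def cross_sectional_demean(panel):
    """Subtract the period-t cross-sectional mean from each unit's series.

    panel: dict mapping unit name -> list of T observations (balanced panel).
    """
    units = list(panel)
    T = len(panel[units[0]])
    # Cross-sectional average in each period t
    means = [sum(panel[u][t] for u in units) / len(units) for t in range(T)]
    return {u: [panel[u][t] - means[t] for t in range(T)] for u in units}

# Hypothetical log consumption ratios for three countries over three years
panel = {"A": [1.0, 1.2, 1.5], "B": [0.8, 1.0, 1.3], "C": [1.2, 1.4, 1.4]}
demeaned = cross_sectional_demean(panel)
print(demeaned)
```

By construction the demeaned values sum to zero across units in every period. A common factor that enters all units with the same loading is removed exactly; with heterogeneous loadings, the deviation of each loading from the average loading, multiplied by the factor, remains in the transformed data.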

An alternative and more powerful approach to address cross-sectional dependence is to include the cross-sectional averages of all the variables in the regression equation à la Pesaran (2006) and Holly et al. (2010). This technique is a straightforward way of dealing with multi-factor and more complex cross-sectional dependence in the data. Consequently, we employ the pooled common correlated effect (CCEP) estimator by Pesaran (2006), which is naturally designed to deal with cross-sectional dependence in the data head on.Footnote 16
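A minimal sketch of the pooled CCE idea, assuming a balanced panel with a single regressor and using NumPy. Each unit's series is projected off a constant and the cross-sectional averages of the dependent variable and the regressor, and a common slope is then estimated from the pooled residuals. The function name and the simulated, noise-free data are hypothetical; standard errors are not computed.

```python
import numpy as np

def ccep_slope(y, x):
    """Pooled CCE slope for a balanced panel with one regressor.

    y, x: arrays of shape (N, T). Projecting each unit separately on the
    constant and the cross-sectional averages gives every unit its own
    intercept and its own loadings on those averages.
    """
    N, T = y.shape
    H = np.column_stack([np.ones(T), y.mean(axis=0), x.mean(axis=0)])
    num = den = 0.0
    for i in range(N):
        # Residuals of unit i's series after regressing on H
        ry = y[i] - H @ np.linalg.lstsq(H, y[i], rcond=None)[0]
        rx = x[i] - H @ np.linalg.lstsq(H, x[i], rcond=None)[0]
        num += rx @ ry
        den += rx @ rx
    return num / den

# Hypothetical noise-free panel driven by one common factor f_t
N, T, theta = 5, 30, 0.7
t = np.arange(T)
f = np.cumsum(np.cos(0.9 * t))                 # common factor
lam = np.linspace(0.5, 1.5, N)                 # heterogeneous loadings in y
gam = np.linspace(1.0, 2.0, N)                 # heterogeneous loadings in x
v = np.sin(np.outer(np.arange(1, N + 1), t))   # idiosyncratic part of x
x = gam[:, None] * f + v
y = np.arange(N)[:, None] + theta * x + lam[:, None] * f
theta_hat = ccep_slope(y, x)
print(theta_hat)
```

In this noise-free design the factor is an exact linear combination of the constant and the two cross-sectional averages, so the projection removes it despite the heterogeneous loadings and the slope is recovered up to floating-point error.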

Finally, recall that economic theory in Sect. 4 suggests that there is no “silver bullet” on which variable to employ as regressand (or regressor). Moreover, using the Ng and Perron rule of thumb does not offer a clear decision on which variable, \(\ln (C_{t}/G_t)\) or \(\ln (P^g_{t}/P^c_t)\), to employ as the regressand or regressor. To this end, we report results for a case where the consumption ratio is employed as the regressand (e.g., Dawood and Francois 2018; Kwan 2009) and another scenario where the price ratio is used as the regressand (e.g., Amano and Wirjanto 1998). In summary, we run 10 different regressions to uncover the intratemporal elasticity of substitution. More importantly, we report several post-estimation results to compare the validity of estimates from the estimation choices.

Table 5 Estimated Values of IES with Consumption Ratio as Regressand

Tables 5 and 6 present the main results. We begin by focusing on the results from Table 5, which utilizes the consumption ratio as the regressand. Panel A presents the pooled estimates of the IES. There are a number of observations. First, all estimated values of \(\theta \) are positive and less than 1, ranging from a low of 0.297 to a high of 0.901. The estimates are statistically significant at conventional levels. The positive estimates satisfy the preference property of non-negativity of the IES. Additionally, because all the estimated values of \(\theta \) are below unity, they suggest that private and public consumption are gross complements in these European economies. The finding of gross complementarity is similar to the findings in Dawood and Francois (2018) and Kwan (2009) for African and East Asian countries, respectively. The results are, however, in contrast to the findings in Amano and Wirjanto (1998), who estimate the IES to be 1.56 for the United States, implying gross substitution between the two goods. Second, the FMOLS and DOLS estimators that assume cross-sectional independence yield the smallest estimates, 0.297 and 0.311, respectively. In contrast, accounting for the presence of cross-sectional dependence of any form, as shown in columns (3)–(5), drastically increases the size of the estimated IES. In particular, the FMOLS-CD produces the largest \(\theta \) amongst the estimates that accommodate cross-sectional dependence. The CCEP estimator, on the other hand, yields the smallest value, 0.515, amongst the estimators that control for cross-sectional dependence. Finally, the DOLS-CD uncovers an IES of 0.738. This generally suggests that an assumption of cross-sectional dependence (or independence) can affect the size of the estimated IES non-trivially.

To better appreciate these estimates, we present a number of post-estimation diagnostic tests that focus on the behavior of the residuals from the estimators in Table 5. The diagnostics are meant to evaluate the performance and efficiency gains of the selected estimators. More importantly, they provide a yardstick for validating a particular model assumption ex post. The diagnostic tests are presented in Panel B of Table 5. They include the Pesaran (2015) cross-sectional dependence test, unit root tests, and the root mean square error (RMSE). In the context of cross-sectional dependence tests, a desirable property of the residuals is that they exhibit cross-sectional independence. Hence, a failure to reject the null hypothesis of cross-sectional independence is the desired outcome. Evidently, from Panel B, the null of cross-sectional independence is rejected for all estimators except the DOLS-CD estimator. Turning to the unit root tests, we apply two of them, the Maddala and Wu ADF test and the CIPS test, to the residuals from the regressions in Table 5. The goal is to check the stationarity property of these residuals. Recall that a failure to reject the null of a unit root violates the stationarity property of preferences described in Sect. 4. As shown in Panel B of Table 5, the p-values associated with the ADF unit root test lead to rejection of the null of a unit root at the 10% significance level or better. However, under the CIPS unit root test, the null is only rejected in the case of DOLS-CD. Finally, the RMSE ranks the CCEP, followed by the DOLS-CD, as the best predictors of the observed data, as they have the smallest and second-smallest RMSE, respectively.Footnote 17 Overall, the DOLS-CD model, which accounts for cross-sectional dependence, outperforms the other competing models.
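As an illustration of the residual diagnostic, the following sketch computes a CD statistic of the Pesaran type for a balanced panel, \(CD = \sqrt{2T/(N(N-1))}\sum _{i<j} {\hat{\rho }}_{ij}\), where \({\hat{\rho }}_{ij}\) is the pairwise correlation of residuals. The function names and the residual values are hypothetical, and unbalanced panels are not handled.

```python
import math

def pearson(a, b):
    """Sample correlation between two equal-length series."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (w - mb) for u, w in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((w - mb) ** 2 for w in b)
    return cov / math.sqrt(va * vb)

def pesaran_cd(residuals):
    """CD statistic for a list of N residual series, each of length T."""
    N, T = len(residuals), len(residuals[0])
    total = sum(pearson(residuals[i], residuals[j])
                for i in range(N) for j in range(i + 1, N))
    return math.sqrt(2.0 * T / (N * (N - 1))) * total

# Hypothetical residuals for three countries over six years
resid = [
    [0.10, -0.20, 0.30, -0.10, 0.20, -0.30],
    [0.12, -0.15, 0.25, -0.05, 0.15, -0.32],
    [-0.05, 0.10, -0.02, 0.08, -0.12, 0.01],
]
cd_stat = pesaran_cd(resid)
print(cd_stat)
```

Under the null the statistic is compared with standard normal critical values, so a large absolute value signals remaining cross-sectional dependence in the residuals.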

Table 6 Estimated Values of IES with Price Ratio as Regressand

We now turn our focus to Table 6. The results presented here are from the estimations where the price ratio is employed as the regressand. We employ the same set of estimators as in Table 5. It is important to note that with the price ratio as the regressand, the estimated parameter, \(\beta \), is the inverse of the IES. Hence, to recover the IES, one needs to take the inverse of the estimated values. From Table 6, \(\beta \) is estimated to lie between 0.318 and 0.444, and these estimates are statistically significant at the 1% level. This implies that the intratemporal elasticity of substitution parameter, \(\theta \), ranges from 2.252 to 3.145, suggesting that private and public consumption are gross substitutes. These results are in stark contrast to the gross complements finding when the consumption ratio is employed as the regressand. Interestingly, the post-estimation diagnostics presented in Panel B of Table 6 generally suggest that the residuals from these regressions violate the set of desirable properties discussed earlier. In particular, compared to the behavior of the residuals from the estimations in Table 5, the null hypothesis of cross-sectional independence of the residuals is strongly rejected for all estimations. Additionally, test statistics from the unit root tests show that the residuals from all the estimations do not possess the required stationarity property, implying that preferences are not stable. Finally, while the RMSE for the estimations in Table 6 is on average lower (i.e., 0.0728) than in the case where the consumption ratio is employed as the regressand (i.e., 0.0794), the difference is marginal. In summary, we conclude that the results in Table 5 are the more reliable estimates, and the DOLS-CD estimator is preferred over the other estimators. Consequently, the preferred estimated IES is 0.738 as given by the DOLS-CD estimator in column 3 of Table 5.

5.1 Cross-sectional augmented distributed lag (CS-DL) estimator

It is worth mentioning that failing to account for cross-sectional dependence (CD) would bias parameter estimates only if the unobservable factors are correlated with the regressors, as noted earlier, but it would reduce efficiency if these factors are correlated with the dependent variable. To purge the effect of strong CD from the estimates, we adopted the pooled Common Correlated Effect (CCEP) estimator developed by Pesaran (2006) as an alternative to cointegration regression techniques, such as DOLS and FMOLS, run on cross-sectionally demeaned data. Unfortunately, the CCEP is not the best candidate for this aim, as it is a static regression procedure and hence is incapable of purging the effect of dynamic adjustment and simultaneous feedback between regressand and regressors (unlike DOLS and FMOLS). The set of dynamic procedures capable of estimating potentially cointegrated regressions while accounting for strong CD includes the cross-sectionally augmented Auto-Regressive Distributed Lag (CS-ARDL) model and the Cross-Sectionally augmented Distributed Lag (CS-DL) model by Chudik et al. (2017, 2013).

These two estimators have their merits and drawbacks, as discussed by Chudik et al. (2013). In particular, the main advantage of the CS-DL approach relative to the CS-ARDL approach is its superior small-sample performance when the time series dimension of the panel is moderate. Specifically, for the consistency of the ARDL estimates, sufficiently long lags are necessary, whereas specifying longer lags than necessary can lead to estimates with poor small-sample properties. The CS-DL method is more generally applicable and requires only that a truncation lag order be selected. A drawback of the CS-DL technique relative to the CS-ARDL approach is that the CS-DL estimates of long-run effects are not consistent when there is significant feedback from the regressand to the regressor. Nonetheless, Chudik et al. (2016) argue that even with this bias, the performance of CS-DL in terms of RMSE is much better than that of the CS-ARDL approach when T is moderate (which is the case in our empirical application). Furthermore, the CS-DL approach is robust to a number of departures from the baseline specification, such as residual serial correlation and possible breaks in the error processes. To this end, we employ the CS-DL estimator, which suits our purposes given the short time dimension of our data (i.e., \(T = 49 <100\)) and the small number of cross-sectional units (\(N= 17\)). We augment the model with four lags of the cross-sectional averages of the dependent variable. The lag length for the cross-sectional average is selected using the rule of thumb of \(T^{1/3}\) suggested by Chudik et al. (2017, 2013).Footnote 18
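The arithmetic behind the lag choice can be checked directly. Rounding \(T^{1/3}\) to the nearest integer is our assumption here, made because it reproduces the four lags reported in the text; taking the integer part instead would give three.

```python
T = 49                  # annual observations, 1970-2018
raw = T ** (1 / 3)      # rule-of-thumb value before rounding, about 3.66
lags = round(raw)       # assumption: round to the nearest integer
print(raw, lags)
```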

Table 7 CS-DL Estimates of the IES

Table 7 reports the results from the CS-DL estimation. For completeness, we report the estimates of the IES for different specifications. While the estimation with the consumption ratio as the dependent variable is our preferred model, we also report the results from the case where the price ratio is employed as the dependent variable (columns 5–8) to highlight the potential model misspecification. The estimates from columns 1–4, where the consumption ratio is employed as the regressand, reveal an IES value ranging from 0.55 to 0.67. Importantly, the null of a unit root in the residuals is strongly rejected for all specifications. While the null of the CD test is rejected in columns 1 and 2, we fail to reject the null for columns 3 and 4. This suggests that the specifications in columns 3 and 4 produce more reliable estimates of the IES (0.62 and 0.67, respectively), which are in line with estimates from the preferred DOLS results in Table 5. Switching to the case where the price ratio is used as the dependent variable à la Amano and Wirjanto (1998) (columns 5–8), the results uncover unusually large estimates of the IES (ranging from 3.07 to 3.16).Footnote 19 Notice that these values, when embedded in the standard RBC model described in Section 2, imply a large negative response of private consumption, which fails to predict the empirical evidence of a positive consumption response to an increase in government consumption. Importantly, only the specification in column 8 satisfies all the post-estimation checks. This further suggests that estimates using the price ratio will lead to misspecification and implausible estimates of the IES.

What does the preferred estimate imply for Edgeworth complementarity (substitutability) between private and public consumption? As discussed in Sect. 2, under a CRRA utility function, the sign of the cross-partial derivative, \(U_{CG} = \partial (\partial u/\partial C)/\partial G\), governs the Edgeworth substitutability (complementarity) between the two goods. Specifically, given the utility function in this study, Amano and Wirjanto (1998) show that the cross-partial derivative, \(U_{CG}\), depends on the difference between the intertemporal elasticity of substitution (\(1/\gamma \)) and the IES (i.e., \(sgn[U_{CG}] = sgn[1/\gamma - \theta ]\)). Private and public consumption are therefore Edgeworth complements (substitutes) if the intertemporal elasticity of substitution is greater (less) than the intratemporal elasticity of substitution. If the two preference parameters are equal, then changes in government consumption have no impact on the marginal utility of private consumption. Fixing the intertemporal elasticity at 0.8 as in Havránek (2015), one can observe that \(U_{CG} > 0\) for the preferred estimate of the IES (i.e., 0.738) as given by the DOLS-CD estimate in Table 5, as well as for the estimates from columns 3 and 4 of the CS-DL results in Table 7. The positive sign of the cross-partial derivative, \(U_{CG}\), suggests that private and government consumption are Edgeworth complements in the Pareto sense, a result consistent with Fiorito and Kollintzas (2004). The finding that the two goods are Edgeworth complements is robust to lower assumed values of the intertemporal elasticity, down to a minimum value of 0.79.
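The sign condition can be illustrated with the values quoted in the text. The function name and the labels are hypothetical; the intertemporal elasticity is fixed at 0.8 as above, and 3.07 is included as an example from the price-ratio specification.

```python
def edgeworth_relation(inter_es, theta, tol=1e-12):
    """Classify the goods by the sign of (1/gamma - theta).

    inter_es: intertemporal elasticity of substitution (1/gamma).
    theta: intratemporal elasticity of substitution (the IES in the text).
    """
    diff = inter_es - theta
    if diff > tol:
        return "complements"
    if diff < -tol:
        return "substitutes"
    return "independent"

inter_es = 0.8
results = {theta: edgeworth_relation(inter_es, theta)
           for theta in (0.738, 0.62, 0.67, 3.07)}
print(results)
```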

The immediate implication for policy is that with Edgeworth complementarity between the two goods, an increase in government consumption would increase the marginal utility of private consumption. If the rise in the marginal utility is stronger than the standard wealth effect induced by tax- (or deficit-) financing of the increase in government consumption, then private consumption would rise. Indeed, as demonstrated in Fig. 2, the baseline estimate of \(\theta \) of 0.738 implies an increase in private consumption following a rise in government consumption. Consequently, a fiscal expansion that increases government consumption would have Keynesian effects on real output in European economies. These Keynesian effects are similar to the findings in Amendola et al. (2020).

6 Heterogeneity in estimated IES

A particular feature of the empirical model in the previous section is the assumption of homogeneity. This assumption leads to efficiency gains from pooling the data, which is a desirable feature for the reliability of estimates (see, Baltagi and Griffin 1997; Baltagi et al. 2008, for a discussion). Moreover, one can generate more accurate predictions for individual outcomes by pooling the data rather than generating predictions of individual outcomes using the data on the individual in question. Specifically, if individual behaviors are similar, conditional on certain variables, panel data provide the possibility of learning an individual’s behavior by observing the behavior of others (Hsiao 2007). Thus, it is possible to obtain a more accurate description of an individual’s behavior by supplementing observations of the individual in question with data on other individuals.

In this study, however, we employ aggregate consumption data at the country level. Pooling the data therefore induces an additional level of aggregation, which, despite its benefits, inadvertently over-aggregates the data and invokes a strong “representative agent” assumption. If individual countries in the panel are heterogeneous in terms of the size (and even the sign) of the IES, the time series properties of the aggregate data would be starkly different from those of the disaggregated data, which in this case are the country-level data (Granger 1988; Pesaran 2003; Pedroni 2001b; Chudik et al. 2017). Importantly, policy evaluation based on aggregate data may be grossly misleading when such heterogeneity is present. These considerations motivate the question: How large is cross-country variation in preferences relative to the pooled estimate? To answer this question, we relax the assumption of homogeneity of the IES and allow it to vary across countries. We therefore employ the Pesaran (2006) Common Correlated Effects Mean Group (GM-CCE) estimator, which allows the IES to vary across countries and accounts for cross-sectional dependence in the data.Footnote 20 It is worth acknowledging that, since consistency of the group-mean estimates relies on a large number of cross-sectional units, the GM-CCE estimates should be taken with caution. Given the limited number of countries used in the regression (i.e., \(N= 17\)), we follow Bond et al. (2010) and also report the robust mean of the single-country elasticities.Footnote 21 Finally, we employ the consumption ratio as the dependent variable.
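A schematic statement of the estimator may help fix ideas. The notation below is generic and assumed for illustration: \(y_{it}\) denotes the dependent variable (here the consumption ratio), \(x_{it}\) the regressor whose coefficient identifies the country-specific IES \(\theta _i\), and \(\bar{y}_t\) and \(\bar{x}_t\) their cross-sectional averages. The exact regressors and deterministic terms are those of the empirical specification in the earlier sections, not of this sketch.

\[ y_{it} = \alpha _i + \theta _i x_{it} + \delta _{1i}\bar{y}_t + \delta _{2i}\bar{x}_t + e_{it}, \qquad \hat{\theta }_{MG} = \frac{1}{N}\sum _{i=1}^{N}\hat{\theta }_i \]

The cross-sectional averages act as proxies for unobserved common factors, and the group-mean estimate is the simple average of the country-by-country estimates. With \(N = 17\), this average rests on few units, which is the reason for the caution noted above.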

Table 8 Country-specific estimates of IES

Table 8 presents the results from the heterogeneous panel estimation. First, the last row reports the panel group-mean estimate of the IES, which yields an estimated value of \(\theta \) equal to 0.59.Footnote 22 This estimated value is smaller than the preferred pooled estimate from the DOLS-CD estimator in Table 5 (i.e., 0.738). Nonetheless, the group-mean estimate still implies that, on average, private and public consumption are gross complements. More importantly, when combined with the relevant intertemporal elasticity of substitution, this estimated value suggests that the two consumption goods are complements in the Edgeworth-Pareto sense. Consequently, the conclusion from the heterogeneous panel estimator reinforces the main inference from the baseline results produced by the homogeneous case. Second, the table reports country-specific estimates of the IES, which highlight strong heterogeneity across countries. The estimates from the CCE country-by-country results and the robust regression are similar. The results can be categorized into three groups: (1) Countries where the IES is positive, less than unity, and statistically significant. These countries are Austria, Finland, France, Germany, Greece, Italy, Luxembourg, Netherlands, Norway, Spain, Sweden, and United Kingdom, although the estimate for the United Kingdom is close to 1. In these economies, private and government consumption are gross complements. (2) Countries where the estimated IES is positive, greater than 1, and statistically significant. Two countries fall in this category: Denmark and Ireland. In these countries, the two goods in question are best classified as gross substitutes. (3) Economies, namely Belgium, Portugal, and Switzerland, where the estimated IES is statistically insignificant, suggesting that the null hypothesis that \(\theta \) is equal to zero cannot be rejected. As discussed in Dawood and Francois (2018), this implies that either private and government consumption are perfect complements or that the identifying assumptions in this study do not hold for these countries. The heterogeneous analysis sheds light on why the panel estimates consistently found private and public consumption to be gross complements: in 12 of the 17 countries, the estimated IES is positive and below unity.

Similar to the panel analysis, we can determine whether the “well-defined” country-by-country estimates imply Edgeworth substitutability/complementarity in individual countries. More precisely, setting the intertemporal elasticity of substitution (\(1/\gamma \)) to a reasonable value of 0.8 as in Havránek (2015), we are able to demonstrate the importance and implications of the size of the IES across countries. With \(1/\gamma \) fixed at 0.8, private and public consumption are Edgeworth complements in most of the countries where the IES is less than unity. That is, the cross partial \(U_{CG}\) is greater than zero for these countries, except for Germany, Greece, Norway, and United Kingdom, where it is less than zero (i.e., \(sgn[U_{CG}] = sgn[0.8-0.8224]<0\), \(sgn[U_{CG}] = sgn[0.8-0.8832]<0\), \(sgn[U_{CG}] = sgn[0.8-0.885]<0\), and \(sgn[U_{CG}] = sgn[0.8-0.9986]<0\), respectively). For the two countries where the IES is greater than one (Denmark and Ireland), private and public consumption are unambiguously Edgeworth substitutes. The conclusions from the country-by-country CCE estimator hold true for the results from the robust regressions. These findings highlight the fact that the same fiscal policy involving changes in government consumption will likely yield very different outcomes in different countries, implying that policy design should be country-specific.
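The classification rule used in this section can be written out as a small illustrative script. The function name `classify` and its labels are hypothetical and introduced here only for exposition; the only numbers used are the IES values quoted in the text, with \(1/\gamma \) set to 0.8 as an assumed calibration.

```python
def classify(theta, inv_gamma=0.8):
    """Classify an IES estimate theta, given an assumed intertemporal
    elasticity of substitution inv_gamma (= 1/gamma)."""
    # Gross relation: compare the IES with unity.
    if theta < 1:
        gross = "gross complements"
    elif theta > 1:
        gross = "gross substitutes"
    else:
        gross = "neither (unit elasticity)"

    # Edgeworth relation: sgn[U_CG] = sgn[1/gamma - theta].
    diff = inv_gamma - theta
    if diff > 0:
        edgeworth = "Edgeworth complements"
    elif diff < 0:
        edgeworth = "Edgeworth substitutes"
    else:
        edgeworth = "Edgeworth independent"
    return gross, edgeworth


# IES values quoted in the text; the remaining country estimates are in Table 8.
estimates = {
    "Pooled (DOLS-CD)": 0.738,
    "Group mean (GM-CCE)": 0.59,
    "Germany": 0.8224,
    "Greece": 0.8832,
    "Norway": 0.885,
    "United Kingdom": 0.9986,
}

for label, theta in estimates.items():
    gross, edgeworth = classify(theta)
    print(f"{label}: theta = {theta} -> {gross}; {edgeworth}")
```

The four countries listed illustrate that the two notions need not coincide: an IES below unity but above the assumed \(1/\gamma \) implies gross complementarity together with Edgeworth substitutability.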

6.1 Explaining the cross-country heterogeneity

The heterogeneity analysis in Table 8 reveals that the size of the IES varies widely across countries. This raises the need to understand the factors that may explain the observed variation in the estimated IES. The natural correlates to consider are the components of government consumption, i.e., public spending on defense, education, health, and public order and safety. According to the functional definition of government consumption, these components can be classified into pure public goods (i.e., defense spending, law courts, and public order and safety) and merit goods (i.e., education and health). A priori, one would expect the pure public good components to be negatively associated with the size of the IES. This is because these components do not have immediate substitutes that can be provided by the private sector, at least on a large scale (Evans and Karras 1996), which reduces the substitutability of these public goods. In contrast, merit goods such as public education and health can be provided by the private sector; hence, conditional on quality, they are likely to be easily substitutable with private education and health, respectively. Consequently, one would expect a positive relationship between these merit goods and the size of the IES. Nonetheless, these inherent characteristics of public and merit goods may not fully summarize the plausible relationship these goods may have with the IES. More precisely, merit goods are thought to have strong positive externalities and may therefore be complementary to private consumption. Thus, one can also expect a negative relationship between merit goods and the degree of substitutability as measured by the size of the IES. The relationship between the IES and the components of public consumption is therefore ambiguous a priori.

Beyond these primary correlates, the size of government, measured as government consumption as a share of GDP, can be a predictor of the IES. As governments get bigger, they are likely to start providing more merit goods and services, such as education and health, relative to the pure public goods and services, such as defense, that they traditionally provide (Karras 1994). Since merit goods can also be provided by the private sector and are therefore more substitutable for private consumption, one would expect the size of the IES to be increasing in government size as the government provides more merit goods.

Table 9 Correlates of the IES at the country level

Table 9 presents the OLS regression estimates for how the size of the IES is related to the shares of defense, public order and safety, law courts, education, and health spending in total expenditure, and to government size. We break our analysis into several parts, first examining the association between the IES and the merit and pure public components of public consumption (Columns 1 and 2, respectively). We then combine all the predictors and examine how each component is associated with the IES (Column 3). Finally, Column 4 presents the full model specification that includes all the components from Columns 1 and 2, government size, and its squared value as predictors of the size of the IES.

We start our discussion with Column 1 in Table 9. The column documents that the merit good components (i.e., education and health) vary positively with the size of the estimated IES. Turning to Column 2, Table 9 indicates that there is no statistically significant relationship between the estimated IES and the pure public goods components, which comprise the shares of defense, law courts, and public order and safety spending in total expenditure. Furthermore, Column 3, which includes both public and merit goods as potential correlates in the regression, reinforces the results from Column 1. Finally, Column 4, which represents the full model, shows that health expenditure is still positively correlated with the IES; however, the coefficient of education is negative but not statistically significant. Interestingly, the coefficient of public order, which is negative and not statistically significant in the other specifications, is now negative and statistically significant at the 10% level. Additionally, we find a U-shaped relationship between the size of the IES and government size. This finding suggests that there is a point beyond which, as governments get bigger, the degree of substitutability between government and private consumption becomes stronger.
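The U-shape can be read off a quadratic specification. As a sketch, with symbols introduced here for illustration only (\(s_i\) for government size, \(\mathbf{z}_i\) for the vector of spending shares, and \(\beta _1\), \(\beta _2\), \(\boldsymbol{\kappa }\) for the coefficients whose estimates are reported in Column 4 of Table 9), the full model has the form

\[ \hat{\theta }_i = \beta _0 + \beta _1 s_i + \beta _2 s_i^{2} + \mathbf{z}_i'\boldsymbol{\kappa } + \varepsilon _i, \qquad \frac{\partial \hat{\theta }_i}{\partial s_i} = \beta _1 + 2\beta _2 s_i \]

A U-shaped relationship corresponds to \(\beta _1 < 0\) and \(\beta _2 > 0\), with the turning point at \(s^{*} = -\beta _1/(2\beta _2)\). For government sizes above \(s^{*}\), the marginal effect is positive, so the IES, and hence the degree of substitutability, rises with government size.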

In summary, there is some evidence that the IES is positively correlated with the share of health expenditures and, to some degree, education expenditures in government consumption, as well as with government size. On the other hand, we find that public order and safety has a negative relationship with the IES, suggesting that a larger share of public order and safety spending in government consumption will likely strengthen the complementarity between public and private consumption at the aggregate level. These findings are analogous to those of Evans and Karras (1996), who find that the non-education component of non-defense spending, such as health, drives the substitutability between private and government consumption. Meanwhile, similar to our result for public order, the authors find that the higher the share of defense spending in government expenditures, the stronger the complementarity between private and public consumption.

7 Concluding remarks

Accounting for utility-enhancing public consumption is important in mediating the effect of changes in government consumption on private consumption. In this paper, we combine theory and empirical work to uncover the intratemporal elasticity of substitution between private and public consumption in European economies. The empirical work explicitly accounts for cross-sectional dependence that may arise due to global shocks and economic spillovers. This feature makes this study the first in the literature to address the issue of cross-sectional dependence while estimating the IES in the context of panel data. Importantly, we remain flexible by adopting several estimators and data treatments in the empirical study. We then rely on simple but effective post-estimation diagnostics to assess the validity of the estimated IES. We find point estimates of the intratemporal elasticity of substitution which reveal that, for plausible values of the corresponding intertemporal elasticity of substitution, government and private consumption are best described as Edgeworth complements in European economies. The results imply that an increase in government consumption increases the marginal utility of private consumption, which can offset the negative wealth effect induced by financing the increase in public consumption. Stated differently, an increase in government consumption can induce Keynesian effects through a positive marginal utility of private consumption. In contrast, fiscal consolidation that cuts government consumption can adversely impact output through a negative marginal utility of private consumption. This last result is similar to the finding in Barrell et al. (2013).

Policy-wise, weaker economic growth in European economies, compounded by the economic impact of the ongoing COVID-19 pandemic, creates a strong need for fiscal stimulus in these economies. Our findings suggest that fiscal expansions that increase government consumption can stimulate aggregate demand via a marginal utility channel of private consumption. The results further reinforce recent arguments that fiscal stimulus packages that comprise large government consumption components may be effective at stimulating aggregate demand (see, for example, Boehm 2019).