1 Introduction

The impact of economic activity on the environment is attracting increasing attention from policy makers, firms and academia. This is partly due to the growing evidence of the negative impact that human activity has on the environment. The aim of this study is to construct an environmental performance index that explicitly accounts for both good output (proxied by GDP) and bad output (CO2 emissions), and to assess whether environmental performance converges globally and within sub-groups of countries. The environmental performance index in this study is defined as the amount of good output per unit of bad output for a given set of inputs and level of technology.

Convergence in global environmental performance has several implications. First, it has implications for burden-sharing among countries regarding carbon emission reduction, technological transfer and support in achieving global environmental commitments. Furthermore, convergence in global environmental performance could influence the negotiations of global climate agreements. It would also provide the basis for a fairer climate-negotiating framework, in which relatively poor environmental performers take on a greater share of the climate agreement commitments than good environmental performers, which in turn may lead more countries to join the agreement.

The literature on emissions is extensive. It includes studies focused on the dynamic properties of emissions, that is, how they change over time as a result of changes in income, income distribution, technical development, etc., as in the environmental Kuznets curve type of studies (e.g., Levinson 2002; Dinda 2004; Stern 2004; Lieb 2004; Kijima et al. 2010; Tsurumi and Managi 2010; Sugiawan and Managi 2016).

A recent strand in the literature is concerned with finding an appropriate measure of environmental performance (EP) as an assessment tool for economic and environmental policy, and with assessing whether EP converges across countries and over time for a given country. Whether or not EP converges across countries has implications for future global emissions negotiations, such as the quota allocations under the Kyoto protocol, and also for regional-level negotiations. Therefore, having an appropriate measure of EP, and consequently providing evidence for convergence or divergence, is extremely important. The majority of the studies in this strand of the literature measure EP using CO2 emissions as a ratio of population, CO2 emissions as a ratio of GDP, or energy consumption. Recent papers in this line of research include Strazicich and List (2003), Ngugen (2005), Aldy (2006, 2007), Ezcerra (2007), Panopoulou and Pantelidis (2009), Camarero et al. (2008), Westerlund and Basher (2008), Brock and Taylor (2010), Camarero et al. (2013), and Brännlund et al. (2015). These studies apply various parametric and non-parametric econometric methods to address the question of convergence in environmental performance measured simply as per capita CO2 emissions, CO2 emissions as a ratio of GDP, or energy consumption. However, according to Ramanathan (2002), these studies provide only a partial picture of environmental performance, as they consider only the emissions originating from economic activities.

There are other approaches to measuring environmental performance that serve as alternatives to these measures. These approaches are diverse, but can be grouped into three main perspectives: (1) the product cycle analysis/assessment, (2) the environmental accounting perspective, and (3) the production theory framework. Each of these approaches focuses on different aspects of EP and has its own strengths and weaknesses, which are well documented in Tyteca (1996) and Olsthoorn et al. (2001).

Our aim in this paper is to propose a simple but relevant measure of EP based on production theory, inspired by recent work that includes Färe et al. (2006), Zaim and Taskin (2000), Zofio and Prieto (2001), Zaim (2004), Zhou et al. (2006) and Picazo-Tadeo and García-Reche (2007), and further to test for conditional β-convergence in environmental performance. Our paper makes three key contributions to the literature.

First, we propose a simple theory for measuring EP that explicitly considers that the production process simultaneously results in both good and bad outputs. This framework is based on Färe et al. (2006). Second, based on this, we calculate, in the first step, an EP index for each country in a sample of 94 countries for the period 1971–2008. In the second step, we use the EP index in a regression analysis to test for β-convergence in environmental performance. As far as we know, we are among the first to do this based on a production theory framework at the global level. The only papers in the literature that study convergence based on the production theory framework are Camarero et al. (2008) and Brännlund et al. (2015), but unlike our paper, Camarero et al. focus only on a sample of OECD countries, while Brännlund et al. focus on Swedish industries. Our work can therefore be seen as an extension of Camarero et al. (2008) on two fronts: it extends the analysis to a global sample, and it takes a β-convergence perspective, as done in Brännlund et al. (2015) for Swedish industries.

Third, we also consider heterogeneity in the rate of convergence, both between groups of countries, in line with “club” convergence, and between countries. This is important for both regional and international negotiations regarding burden-sharing in global emissions, especially CO2 emissions. It is important to know the contributions to growth in CO2 emissions by regions and by countries to aid the negotiations on burden-sharing of global CO2 emissions. We also use econometric techniques that accommodate both country- and time-specific effects. These help to reduce spillover effects (cross-sectional dependence) from common shocks, and therefore reduce both the possible bias that spillover effects can generate in the estimated parameters and possible confounding effects.

The rest of the paper is organized as follows. Section two provides the theory and method for measuring the EP index, and section three provides details on the empirical approach for the study. The data is presented in section four, and the results are presented in section five. Section six, finally, contains some concluding comments and policy implications from the study.

2 Theory and method

The theoretical approach outlined here follows primarily Färe et al. (2006). The theory as such is not novel, and it is presented here mainly to lay the foundation for the performance index that will be used subsequently in the empirical analysis.

In particular, the environmental performance index, EP, which will be used here, is based on neoclassical production theory, which explicitly recognizes that the production process results in both good and bad outputs. More specifically, this means that we will use a quantity approach based on ratios of output distance functions. In this particular case with one good (GDP) and one bad (CO2) output, it turns out that this ratio of distance functions results in a simple expression showing the growth path of the inverse of the emission intensity. The use of the inverse of emission intensity is a consequence of the underlying production theory framework, where the output of interest is good output, but in producing good output, bad output is also produced. This is the rationale for dividing the good output index by the bad output index, an approach that credits good output per unit of bad output. This is the standard convention in the literature (e.g., Färe et al. 2006).

The distance functions here are defined on the output possibility set, P(x), expressed as \( P(x) = \left\{ {(g,b):x\,{\text{can}}\,{\text{produce}}\,(g,b)} \right\} \). Here g is good output, b is bad output, and x is a vector of inputs. In general, g and b are vectors. P(x) is assumed to be convex, closed, and bounded (i.e., compact), with inputs and good outputs being freely disposable. Free disposability of good outputs is formally expressed as: if \( (g,b) \in P(x) \) and \( g' \le g \), then \( (g',b) \in P(x) \), which means that a good output can always be reduced without reducing any other output.

In addition to these technological properties, good and bad outputs are assumed to be weakly disposable and null-joint, i.e., good output cannot be produced without producing any bad output.
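For reference, these two assumptions are usually formalized as follows in the production theory literature. The scaling factor \( \theta \) is notation introduced here for illustration and does not appear elsewhere in the paper:

$$ (g,b) \in P(x)\quad {\text{and}}\quad 0 \le \theta \le 1\quad \Rightarrow \quad (\theta g,\theta b) \in P(x)\quad ({\text{weak disposability}}) $$

$$ (g,b) \in P(x)\quad {\text{and}}\quad b = 0\quad \Rightarrow \quad g = 0\quad ({\text{null-jointness}}) $$

Weak disposability says that bad output can only be reduced together with a proportional reduction in good output, while null-jointness says that zero bad output is only feasible with zero good output.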

Let \( x^{0} \), \( b^{0} \), and \( g^{0} \) be given reference levels of inputs, bad output and good output, respectively. Then, given the technological properties above, a good and a bad output quantity index can be specified. In the case with only one good and one bad output, the quantity indexes become,

$$ Q_{g}^{t} (g^{t + 1} ,g^{t} ) = \frac{{D_{y}^{t} \left( {x^{0} ,1,b^{0} } \right)\,.\,g^{t + 1} }}{{D_{y}^{t} \left( {x^{0} ,1,b^{0} } \right) \, .\, g^{t} }} = \frac{{g^{t + 1} }}{{g^{t} }} $$
(1)
$$ Q_{b}^{t} (b^{t + 1} ,b^{t} ) = \frac{{D_{b}^{t} \left( {x^{0} ,g^{0} ,1} \right)\,.\,b^{t + 1} }}{{D_{b}^{t} \left( {x^{0} ,g^{0} ,1} \right)\,.\,b^{t} }} = \frac{{b^{t + 1} }}{{b^{t} }} $$
(2)

Equation (1), the good output index, reflects the change in good output from period t to period t + 1, holding inputs and bad output constant, while the index in (2) reflects the change in bad output, holding inputs and the good output constant.

The index for environmental performance, EP, is then,

$$ {\text{EP}}^{t,t + 1} \left( {g^{t + 1} ,g^{t} ,b^{t + 1} ,b^{t} } \right) = \frac{{Q_{g}^{t} (g^{t + 1} ,g^{t} )}}{{Q_{b}^{t} (b^{t + 1} ,b^{t} )}} = \frac{{g^{t + 1} /g^{t} }}{{b^{t + 1} /b^{t} }} = \frac{{g^{t + 1} /b^{t + 1} }}{{g^{t} /b^{t} }} $$
(3)

If production of the good (bad) output increases between time periods t and t + 1, with everything else held constant, it will influence \( {\text{EP}}^{t,t + 1} \) positively (negatively).

From (3), it is also clear that EP is the growth rate (plus one) of the inverse of the emission intensity index. That is, if we define the inverse of emission intensity as \( I^{t} = g^{t} /b^{t} \), then,

$$ I^{t} = {\text{EP}}^{t - 1,t} .\,I^{t - 1} $$
(4)

Applying (4) recursively from period 1 to period t and dividing both sides by \( I^{0} \) gives the environmental performance between time period 0 and t. From Eq. (3), it is clear that EP can be decomposed into two components. For instance, if a country’s EP improves, it can be investigated whether this is mainly due to an increase in good output, mainly due to a reduction in bad output, or due to a balanced combination of the two.
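As a minimal sketch of the calculation in Eqs. (3) and (4), the following Python snippet computes the inverse emission intensity and the EP index for a single country and chains EP back to a base period. The function names and the three-period GDP and CO2 series are hypothetical and chosen only for illustration; they are not the data used in this study.

```python
import math

def inverse_intensity(g, b):
    """Inverse emission intensity I^t = g^t / b^t for each period t."""
    return [gt / bt for gt, bt in zip(g, b)]

def ep_index(g, b):
    """EP^{t,t+1} from Eq. (3): growth in good output over growth in bad output."""
    return [(g[t + 1] / g[t]) / (b[t + 1] / b[t]) for t in range(len(g) - 1)]

def cumulative_ep(ep):
    """Chain Eq. (4): I^t / I^0 is the product of EP^{s-1,s} for s = 1..t."""
    out = [1.0]
    for e in ep:
        out.append(out[-1] * e)
    return out

# Hypothetical series (illustrative only, not taken from the paper's data).
gdp = [100.0, 110.0, 121.0]   # good output g^t
co2 = [50.0, 50.0, 55.0]      # bad output b^t

I = inverse_intensity(gdp, co2)
EP = ep_index(gdp, co2)
CUM = cumulative_ep(EP)

# Consistency check: chaining EP reproduces I^t / I^0.
assert all(math.isclose(c, i / I[0]) for c, i in zip(CUM, I))
```

In this example GDP grows by 10% in both periods, while CO2 is flat in the first period and grows by 10% in the second. EP is therefore about 1.1 and then about 1.0: performance improves only when good output grows faster than bad output.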

The fundamentals for our analysis are Eqs. (3) and (4). Given data on good and bad output, we can calculate EP and I to be used in the second step, the convergence analysis. The convergence analysis is based on the theoretical prediction of the neoclassical growth model developed by Ordás Criado et al. (2011). Our reduced-form model, as presented below, can be interpreted as predicting β-convergence in environmental performance, conditional on the input combination used to produce good and bad output.

3 Empirical approach

The empirical analysis is performed in two steps. First, we specify and calculate the EP index at the country level. In the second step, we specify a typical “growth equation” with EP as the dependent variable and I as the independent variable, together with other control variables such as capital and fossil fuel use. The choice of controls is based on production theory, where each production process depends on a given technology that combines effective capital, energy and labor services to produce a unit of output and, in the process of converting the inputs (effective capital and energy), also generates pollution. This production process at the firm level can be aggregated to the macro level to provide aggregate output and emission levels that are generated from the input combination via a given level of technology. We therefore assume that environmental performance depends on the inverse of the initial emission intensity, capital and fossil fuel use, where fossil fuel use serves as a proxy for energy use, since most emissions are generated from the use of fossil fuels.

The main focus of this study is to examine the convergence hypothesis, specifically convergence in EP (growth in the inverse of CO2 intensity), by analyzing cross-country data that span a period of over 30 years. Three main approaches are commonly applied to analyze the convergence hypothesis in both the environmental and the economic growth literature. In this study, however, we focus on the β-convergence approach in a panel data framework. Formally, we specify the basic empirical model as,

$$ \begin{aligned} \ln {\text{EP}}_{i,t} &= \alpha_{i} + \phi_{t} + \beta \ln I_{i,t - 1} + \gamma_{1} \ln K_{i,t} + \gamma_{2} \ln {\text{FF}}_{i,t} + \varepsilon_{i,t} , \\ &\quad i = 1, \ldots ,N\;({\text{countries}}),\quad t = 1, \ldots ,T\;({\text{time periods}}) \end{aligned} $$
(5)

where β is the convergence parameter (the parameter of central interest), \( I_{i,t - 1} \) is the initial level of the inverse of emission intensity, \( \ln K_{i,t} \) is capital in logs and \( \ln {\text{FF}}_{i,t} \) is fossil fuel use in logs. Our interest is to test the hypothesis that β < 0, which, if it holds, implies the existence of so-called β-convergence. This means that countries with a relatively low initial level of the inverse of emission intensity (a high level of emission intensity) tend to grow faster in terms of environmental performance, and hence tend to catch up with countries that start at a higher initial level of the inverse of emission intensity. If this is the case, it may be because the low-performing countries can benefit from the experience and technologies developed and used by the high performers. We also include time-specific effects (\( \phi_{t} \)) as well as country-specific effects (\( \alpha_{i} \)) to account for unobservables that are time-specific and country-specific, respectively. Thus, if β < 0, \( \alpha_{i} = \alpha \) for all i, and \( \gamma_{1} = \gamma_{2} = 0 \), then we have what has been called absolute convergence, i.e., all countries converge to the same inverse emission intensity level. If β < 0 but the other restrictions do not hold, we have what has been called conditional convergence, which means that the growth paths differ across countries and hence do not converge to the same emission intensity level.
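To illustrate how a specification such as Eq. (5) with country and time effects can be estimated, the sketch below applies the two-way within transformation to a balanced panel and runs least squares on the transformed data. It is a sketch under stated assumptions, not the estimation code of this study: the data are synthetic, the regressors are drawn independently of the error term (so the dynamic nature of the lagged regressor in Eq. (5) is ignored), no standard errors are computed, and all names (`within`, `two_way_fe`, `true_coef`) are hypothetical.

```python
import numpy as np

def within(a):
    """Two-way within transformation for a balanced panel.

    `a` has shape (N, T) or (N, T, k). Country means and period means are
    removed and the overall mean is added back.
    """
    return (a
            - a.mean(axis=1, keepdims=True)         # country means
            - a.mean(axis=0, keepdims=True)         # period means
            + a.mean(axis=(0, 1), keepdims=True))   # overall mean

def two_way_fe(y, X):
    """Least squares on the within-transformed data; returns slope estimates."""
    y_w = within(y).reshape(-1)
    X_w = within(X).reshape(-1, X.shape[2])
    coef, *_ = np.linalg.lstsq(X_w, y_w, rcond=None)
    return coef

# Synthetic balanced panel (assumed values, for illustration only).
rng = np.random.default_rng(0)
N, T = 20, 30
true_coef = np.array([-0.4, -0.3, -0.2])   # stand-ins for beta, gamma_1, gamma_2
alpha = rng.normal(size=(N, 1))            # country effects
phi = rng.normal(size=(1, T))              # time effects
X = rng.normal(size=(N, T, 3))             # stand-ins for ln I_{t-1}, ln K, ln FF
y = alpha + phi + X @ true_coef + 0.01 * rng.normal(size=(N, T))

coef_hat = two_way_fe(y, X)
```

Because the within transformation removes any additive country and period effects, the slopes are recovered even though `alpha` and `phi` are never estimated explicitly. A negative first coefficient corresponds to β < 0 in Eq. (5).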

Further, to allow for more flexibility and to account for heterogeneity in convergence, a more general specification of Eq. (5) is proposed that can be expressed as,

$$ \ln {\text{EP}}_{i,t} = \widetilde{\alpha }_{i} + \phi_{t} + \widetilde{\beta }\ln I_{i,t - 1} + \widetilde{\gamma }_{1} \ln K_{i,t} + \widetilde{\gamma }_{2} \ln {\text{FF}}_{i,t} + \gamma_{3} \ln I_{i,t - 1} \cdot \ln K_{i,t} + \gamma_{4} \ln I_{i,t - 1} \cdot \ln {\text{FF}}_{i,t} + \gamma_{5} \ln K_{i,t} \cdot \ln {\text{FF}}_{i,t} + \widetilde{\varepsilon }_{i,t} $$
(6)

The convergence parameter, corresponding to β in Eq. (5), is then found by taking the partial derivative of Eq. (6) with respect to the log of the inverse of the initial emission intensity. Correspondingly, the effect on EP of the level of fossil fuel use is the derivative with respect to the log of fossil fuel use:

$$ \begin{aligned} \frac{{\partial \ln {\text{EP}}_{i,t} }}{{\partial \ln I_{i,t - 1} }} &= \widetilde{\beta } + \gamma_{3} \ln K_{i,t} + \gamma_{4} \ln {\text{FF}}_{i,t} \\ \frac{{\partial \ln {\text{EP}}_{i,t} }}{{\partial \ln {\text{FF}}_{i,t} }} &= \widetilde{\gamma }_{2} + \gamma_{4} \ln I_{i,t - 1} + \gamma_{5} \ln K_{i,t} \end{aligned} $$

As a result, using Eq. (6) allows the rate of convergence to vary between countries due to differences in capital intensity and fossil fuel use.
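The two partial derivatives of Eq. (6) can be checked numerically. The sketch below codes the deterministic part of Eq. (6), the two analytical derivatives, and a central finite difference for comparison. The coefficient values and evaluation point are hypothetical placeholders, not estimates from this study, and the function names are introduced here for illustration.

```python
def ln_ep(ln_i, ln_k, ln_ff, p):
    """Deterministic part of Eq. (6), excluding fixed effects and the error term."""
    return (p["beta"] * ln_i + p["g1"] * ln_k + p["g2"] * ln_ff
            + p["g3"] * ln_i * ln_k
            + p["g4"] * ln_i * ln_ff
            + p["g5"] * ln_k * ln_ff)

def convergence_rate(ln_k, ln_ff, p):
    """Derivative of ln EP with respect to ln I_{t-1}."""
    return p["beta"] + p["g3"] * ln_k + p["g4"] * ln_ff

def fossil_fuel_elasticity(ln_i, ln_k, p):
    """Derivative of ln EP with respect to ln FF."""
    return p["g2"] + p["g4"] * ln_i + p["g5"] * ln_k

# Hypothetical coefficients and evaluation point (illustrative only).
params = {"beta": -0.5, "g1": -0.3, "g2": -0.2, "g3": 0.05, "g4": 0.02, "g5": 0.04}
ln_i, ln_k, ln_ff = 1.0, 3.0, 4.0

h = 1e-6  # step for the central finite difference
num_conv = (ln_ep(ln_i + h, ln_k, ln_ff, params)
            - ln_ep(ln_i - h, ln_k, ln_ff, params)) / (2 * h)
num_ff = (ln_ep(ln_i, ln_k, ln_ff + h, params)
          - ln_ep(ln_i, ln_k, ln_ff - h, params)) / (2 * h)

ana_conv = convergence_rate(ln_k, ln_ff, params)
ana_ff = fossil_fuel_elasticity(ln_i, ln_k, params)
```

With these placeholder values the convergence rate differs from `params["beta"]` because of the two interaction terms, which is exactly the channel through which Eq. (6) lets the rate vary across countries and over time.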

The specifications presented in Eqs. (5) and (6) are estimated using a fixed effects model (time and country) for the following reasons. First, we do not think that country-specific unobservables, such as norms or culture, are uncorrelated with the explanatory variables, such as the inverse of emission intensity or capital. It is most likely that norms and culture influence how policies are adopted to influence capital accumulation, its association with labor, and the effect of this combination on the relation between output and emissions. Second, given the long time period, time series properties cannot be ignored. We assess this via various diagnostic tests on the residuals to check for stationarity and for cross-sectional dependence. We also include time fixed effects, which capture common factors that are time-specific. Omitting them could lead to spillover effects (cross-sectional dependence), which may bias the estimates in either direction. It is important to clarify that time fixed effects cannot account for all forms of common factor effects, but only for time-specific common factors. Moreover, the long time dimension means that the fixed effects model does not suffer from the potential dynamic panel bias (Nickell bias) that arises from including the lagged inverse emission intensity as one of the regressors, as shown by Judson and Owen (1999).

4 Data

The data used for the analysis is a panel data set covering 1971 to 2008 for 94 countries (the list of countries is presented in Table 5 in the appendix). There are two reasons for limiting the analysis to the period 1971 to 2008. First, the lower bound is due to the lack of data on one of the key variables (CO2) for most of the countries for years before 1971; since we want to cover as many countries as possible, we rely on data from 1971 onwards. The second reason concerns the 2008 upper limit. We stop at 2008 because we did not want our results to be influenced by the 2008 financial crisis, which resulted in a slowdown of the world economy and affected CO2 emissions via a reduction in world energy demand. The potential CO2 effects during that period might be due to the slowdown in the world economy initiated by the financial crisis rather than to efficiency- and performance-related measures. To avoid these confounding effects, we restricted our analysis to the period up to 2008. The number of countries is limited to 94 due to a lack of data for some of the variables for some of the countries for the period under study.

The variables in the dataset are CO2 emissions in kilotons, gross domestic product (GDP) in constant 2000 US dollars, gross fixed capital as a ratio of GDP, and total fossil fuel consumption as a ratio of total energy use. All variables are taken from the World Development Indicators (WDI) database of the World Bank. The CO2 data together with the GDP data are used to construct the environmental performance index.

The CO2 data were originally gathered and developed by Marland et al. (1989), and were constructed as estimates of CO2 emissions from fossil fuel burning and cement manufacturing. This method was adopted by the Carbon Dioxide Information Analysis Center of the US Department of Energy in the construction of the recent CO2 series, which the World Bank uses as its source. One shortcoming of this data series is that it omits carbon dioxide emissions stemming from deforestation, land-use changes, and the burning of wood fuel. Irrespective of this, we think this variable is reliable and, most of all, consistent across the world, and that it can reasonably approximate global CO2 emissions despite an uncertainty of 6–10% (Strazicich and List 2003). This error can, however, be higher than 10% at the individual country level.

The data on capital is constructed based on capital investments that include plant, machinery and equipment purchases, land development, rail and road constructions, buildings, and net acquisition of valuables. The capital variable is expressed as a ratio of GDP. It is possible that some of the variables included in the construction of the capital stock might not directly require energy services in their use, but might do so in their production and therefore their contribution to CO2 emissions might be due to both processes (consumption and production).

Fossil fuel data comprise coal, oil, petroleum, and natural gas products, expressed as a ratio of total energy use. Total energy use data is constructed from primary energy use, which includes indigenous production plus imports and stock changes, and excludes exports and fuels supplied to ships and aircraft engaged in international transport. It is expressed in kilotons of oil equivalent.

Summary statistics for the variables used in the analysis are presented in Table 1 and reveal a large variation between countries for all the variables. The variability of each variable is shown by its standard deviation, with large values indicating more variability.

Table 1 Summary statistics for 94 countries, 1971–2008

5 Results

Given the calculation of EP and the inverse of the emission intensity index according to Eqs. (3) and (4), the empirical strategy in our analysis follows a two-step procedure. First, we estimate Eqs. (5) and (6) using the whole panel to examine the β-convergence hypothesis, while allowing for heterogeneity in the convergence parameter via interaction terms between previous I, K, and FF (inverse of emission intensity, capital and share of fossil fuel use). Second, we divide the sample into three groups of countries (low-income, middle-income and high-income) based on per capita income levels consistent with the World Bank classification. The idea is to assess whether the test for β-convergence is sample-dependent, and to examine possible heterogeneity in β-convergence in line with so-called “club” convergence, by testing whether countries with different per capita income levels converge differently.

The results presented in Table 2 are based on the global sample, where we first test for unconditional β-convergence by estimating Eqs. (5) and (6) with the restriction that \( \alpha_{i} = \alpha \) for all i and γ = 0. The results for unconditional β-convergence are presented in column (1) of Table 2. The estimated coefficient of the inverse of the initial emission intensity (the β-convergence parameter) is negative and significant at the 5% level. The implication of this result is that countries that start at a lower level of the inverse of the initial emission intensity tend to grow faster in their environmental performance than countries that start at a higher level, and that this difference depends only on the initial level of the inverse of emission intensity and nothing else. Intuitively, this unconditional β-convergence hypothesis appears very restrictive, in the sense that other important factors are likely to influence differences in the growth rate of EP as well. For instance, differences in the amount of fossil fuel used in the production process should have consequences for the EP of countries (even if we control for the existing technology).

Table 2 Panel estimates for conditional and unconditional β-convergence (global sample)

Next, we proceed to test the restrictions imposed in estimating the unconditional β-convergence model by relaxing the restriction that \( \alpha_{i} = \alpha \) for all i and γ = 0. The results are reported in columns (2) to (7), with column (2) allowing for country-specific constants and additional regressors (capital and fossil fuel use). Columns (3)–(6) are models that allow for various combinations of interactions between the initial inverse of emission intensity, capital and fossil fuel use, while column (7) reports the results for a model that adds an interaction between fossil fuel use and capital to the model in column (3). The hypothesis that \( \alpha_{i} = \alpha \) is rejected for each of the models presented in columns (2) to (7). The hypothesis that γ = 0 is also rejected at the 5% level of significance, implying that the unconditional β-convergence model is not appropriate for exploring the β-convergence hypothesis in this study. Henceforth, we therefore focus on the conditional β-convergence hypothesis. To discriminate between the conditional β-convergence models, we apply a cross-validation (CV) criterion to the models presented in columns (2)–(7) and base our conclusions on the model with the smallest CV value, while also taking cross-sectional dependence into account. The CV values show little difference, but indicate that the model presented in column (7) has the smallest CV value. In addition, its errors do not suffer from cross-sectional dependence at the 5% significance level. Hence, we focus our interpretation on the results presented in column (7) of Table 2.

The results from the preferred model provide evidence of conditional β-convergence in environmental performance, as the coefficient of the inverse of the initial emission intensity is significantly negative. Further, the results indicate that the convergence parameter does not vary with the level of capital for the global sample, since the coefficient of the interaction term between the inverse of the initial emission intensity and capital is insignificant. The capital elasticity is −0.3, which is significant at the 5% level and implies that capital, on average, tends to reduce environmental performance. The implication is that the carbon- and energy-conservation investments that have been undertaken are not large enough to offset the additional carbon associated with a larger capital stock; as a consequence, the energy requirements outweigh the energy-saving potential of these investments. This could also mean that the net effect of high trade intensity in the world, which tends to increase emissions in developing countries, outweighs the emission-reducing effect of trade in developed countries, as found in Managi et al. (2009).

The estimated effect on EP of the fossil fuel share, on the other hand, depends on the capital stock, since the interaction effect is statistically significant. The fossil fuel share elasticity is calculated by taking the partial derivative of EP with respect to the fossil fuel share, and it therefore varies over time and across countries due to variation in the capital stock. Figure 1 presents the average global fossil fuel share elasticity, which varies from −0.21 to −0.02 over the period under study. The implication is that EP appears to be less responsive to changes in the global average fossil fuel share in 2008 than in, say, 1971. One possible interpretation is that, even if the share of fossil fuels increases on average, its effect on EP has become smaller due to improvements in energy efficiency/productivity, that is, more good output (g) for the same amount of bad output (b).

Fig. 1 Estimated global average fossil fuel elasticity for the period 1971–2008

The time series properties of the preferred model are also analyzed. Given the long time dimension, we cannot ignore time series issues such as unit roots in the model’s residuals, cross-sectional dependence (spillover effects), and serial correlation. The preferred model, however, turns out to perform well on all the time series diagnostics. We applied three different panel unit root tests (Maddala and Wu 1999, Im et al. 2003, and Pesaran 2007) to the model residuals, with and without trend. The results from these tests consistently reject the null of a unit root, implying that the model residuals are I(0), which is consistent with the concept of cointegration and a non-spurious regression. Further, we applied the CD test proposed by Pesaran (2004) for testing cross-sectional dependence. The test could not reject the null hypothesis of cross-sectional independence of the residuals at the 5% significance level, implying no cross-sectional dependence in the preferred model. Additionally, the model is free from second-order serial correlation, as the P value of the test statistic indicates that the null hypothesis of no second-order serial correlation cannot be rejected at the 5% significance level.
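For readers unfamiliar with the CD test, the sketch below computes the statistic for a balanced panel of residuals as the scaled sum of pairwise residual correlations, \( {\text{CD}} = \sqrt {2T/(N(N - 1))} \sum\nolimits_{i < j} {\hat{\rho }_{ij} } \), which is asymptotically standard normal under the null of cross-sectional independence. This is an illustrative implementation under the assumption of a balanced panel, not the routine used to produce the results reported here; the function names and the example residuals are hypothetical.

```python
import math
import numpy as np

def pesaran_cd(resid):
    """CD statistic for a balanced (N, T) array of residuals.

    Pairwise correlations are computed with np.corrcoef, which demeans each
    country's residual series before correlating.
    """
    n, t = resid.shape
    corr = np.corrcoef(resid)            # (N, N); rows are treated as variables
    upper = np.triu_indices(n, k=1)      # index pairs with i < j
    return math.sqrt(2.0 * t / (n * (n - 1))) * corr[upper].sum()

def two_sided_p(cd):
    """Two-sided p-value from the standard normal distribution."""
    return math.erfc(abs(cd) / math.sqrt(2.0))

# Hypothetical example: every country shares the same residual path, so all
# pairwise correlations equal one and CD takes its maximum sqrt(T*N*(N-1)/2).
base = np.array([0.3, -0.1, 0.4, -0.5, 0.2, 0.0, -0.3, 0.1, 0.6, -0.7])
dependent = np.tile(base, (4, 1))
cd_dependent = pesaran_cd(dependent)
p_dependent = two_sided_p(cd_dependent)
```

A large absolute CD value with a small p-value signals cross-sectional dependence, as in this constructed example. The result reported above corresponds to the opposite case, where the null of independence is not rejected.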

5.1 Heterogeneity based on sub-samples

To address possible heterogeneity based on per capita income levels, we estimated Eqs. (5) and (6) on three sub-samples (low-, middle- and high-income) using the preferred model. The estimated results are reported in Table 3, with the heading of each column indicating the sub-sample used. The estimated convergence parameter in each of the sub-samples is negative and significant at the 5% level, implying evidence of conditional convergence. However, a Chi-square test of whether the differences in the estimated convergence parameters are significant reveals that both the low- and middle-income samples are significantly different from the high-income sample, but the difference between the low- and middle-income samples is not significant.

Table 3 Panel estimates for conditional β-convergence (sub-samples based on income level)

Moreover, we find the interaction term between capital and the inverse of the initial emission intensity to be significant only for the high-income sample. The results thus reveal heterogeneity at the country level within the high-income group (since the partial derivative of EP with respect to the inverse of the initial emission intensity depends on the level of capital). The estimated rates of convergence for the low-income and middle-income groups are −0.40 and −0.35, respectively. In the case of the high-income group, the estimated rate of convergence is given by \( \partial \ln {\text{EP}}/\partial \ln I = - 0.873 + 0.244 \cdot \ln K_{i,t} \), which varies over time and across countries, since capital varies across countries and over time. Evaluated at the mean of \( \ln K \), the convergence rate becomes −0.12, which indicates a slower convergence rate on average for the richest countries. Averaging over countries, Fig. 2 presents the variation over time in the rate of convergence for the high-income sample. Figure 2 reveals small variability over time, ranging from −0.14 to −0.10. However, if we instead calculate the variation over countries, by taking averages over time, the variation increases and ranges from −0.17 to −0.04 (results are presented in Table 6 in the appendix), implying that differences in convergence are slightly larger between countries than over time for the high-income countries.
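As a short worked example, the reported high-income expression can be evaluated and inverted to see which values of \( \ln K \) the reported rates imply. The two coefficients are the ones quoted in the text; everything else (function names, the back-calculated values of \( \ln K \)) is derived here for illustration and is only approximate, since the reported numbers are rounded.

```python
import math

BETA_TILDE = -0.873   # reported coefficient on ln I (high-income sample)
GAMMA_3 = 0.244       # reported coefficient on the ln I x ln K interaction

def high_income_rate(ln_k):
    """Convergence rate for the high-income group at a given ln K."""
    return BETA_TILDE + GAMMA_3 * ln_k

def implied_ln_k(rate):
    """Invert the expression: the ln K that yields a given convergence rate."""
    return (rate - BETA_TILDE) / GAMMA_3

# ln K implied by the reported mean rate of -0.12 (roughly 3.09).
mean_ln_k = implied_ln_k(-0.12)

# ln K values implied by the reported range over time (-0.14 to -0.10).
ln_k_low, ln_k_high = implied_ln_k(-0.14), implied_ln_k(-0.10)

# The rate stays negative (convergence) as long as ln K is below this threshold.
ln_k_threshold = implied_ln_k(0.0)
```

The implied mean of \( \ln K \) is about 3.09, and the rate remains negative for any \( \ln K \) below roughly 3.58. Because the interaction coefficient is positive, a higher capital ratio moves the rate toward zero, i.e., toward slower convergence.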

Fig. 2 Graph for β-convergence for high-income countries

The results further show that the capital share does not have a significant effect on environmental performance for the low-income countries. This might be explained both by the size and type of capital used by this group of countries and by the weak enforcement of environmental regulation in low-income countries, which implies, among other things, low abatement costs and consequently low capital investment aimed at ensuring good environmental performance. It is likely that a greater share of the capital stock in low-income countries does not require much in the way of energy services, partly due to geography and partly due to affordability. For instance, buildings in most low-income countries do not have electric cookers, washing machines, or heating/air conditioning, which implies less energy use per house relative to the high-income countries. Most industrial activities are small-scale with a relatively small capital stock compared with the large-scale industries in the developed countries. This implies less energy-intensive industrial activity in the low-income countries relative to the high-income countries, and hence less energy use and lower carbon emissions (here we are referring to the size, not the efficiency level, of capital), which is consistent with the finding in Managi et al. (2009) for developing countries. They attribute positive effects of capital-labor and regulation on emissions to the highly labor-intensive production processes and less stringent environmental regulation of the developing countries. The estimates based on the middle- and high-income samples indicate, on the other hand, a significant effect of capital, which varies with the inverse of the initial emission intensity and fossil fuel use for the high-income group, while it varies only with fossil fuel use for the middle-income group. The average capital elasticity for both the middle- and high-income groups is presented in Fig. 4 in the appendix, and shows variation over time as expected.

We also find that the fossil fuel elasticity varies with the level of capital for both the middle- and high-income samples, consistent with the global model. However, in the case of the low-income sample, we find no evidence of variation with the level of capital, since the interaction term between fossil fuel use and capital is insignificant at the 5% level. The fossil fuel elasticity for the low-income sample is −0.24, which implies that fossil fuel usage tends to have a negative impact on environmental performance. We also calculated the average middle- and high-income fossil fuel elasticities, which are presented in Fig. 3. As can be seen, relative to the elasticity for the low-income countries, the elasticity is lower on average and varies slightly over time. One possible explanation for the difference in fossil fuel share elasticity between low- and high-income countries is that high-income countries have more carbon-efficient technology than low-income countries, i.e., more good output can be produced for the same amount of bad output.

Fig. 3 Estimated average fossil fuel elasticity for middle- and high-income samples
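To make the role of the interaction term concrete, the following is a minimal sketch of how an elasticity that varies with capital can be computed. It assumes a specification in which fossil fuel use enters both on its own and interacted with capital; the function name, the coefficient values for the hypothetical group and the capital levels are illustrative assumptions, not estimates from this study. Only the value −0.24 for the low-income sample is taken from the text.

```python
def fossil_fuel_elasticity(b_fossil, b_interaction, capital):
    """Marginal effect of fossil fuel use on EP in a model with an
    interaction term: effect = b_fossil + b_interaction * capital."""
    return b_fossil + b_interaction * capital


# Hypothetical coefficients and capital levels (NOT the paper's estimates).
B_FOSSIL = -0.30
B_INTERACTION = 0.02
capital_by_year = {1995: 2.0, 2000: 2.5, 2005: 3.0}

# Elasticity evaluated at the capital level observed in each year.
elasticity_by_year = {
    year: fossil_fuel_elasticity(B_FOSSIL, B_INTERACTION, k)
    for year, k in capital_by_year.items()
}
average_elasticity = sum(elasticity_by_year.values()) / len(elasticity_by_year)

# With an insignificant (here: zero) interaction term, the elasticity is a
# constant, as reported for the low-income sample (-0.24).
low_income_elasticity = fossil_fuel_elasticity(-0.24, 0.0, 2.5)
```

When the interaction coefficient is zero, the elasticity no longer depends on capital, which is why a single number can be reported for the low-income sample while the other two groups require a plot over time.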

Additionally, we examine the effect of the Kyoto protocol on EP by including a dummy for the Kyoto implementation period in Eq. (6). The results are reported in Table 7 in the appendix and indicate a significant positive effect of Kyoto on EP for both the global sample and each of the sub-samples. However, the results based on the sub-samples reveal that the shifts are higher for the low- and middle-income countries than for the high-income countries, even though the low-income countries have no specific binding commitments. As a robustness check on the construction of the Kyoto dummy, we also used a different dummy that considers both the signing and the implementation period of the protocol (the Kyoto protocol was signed in 1997, while its implementation took effect in 2005). We did not, however, find any significant differences in the results; hence, we report only the results based on the implementation period.
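The two dummy constructions described above can be sketched as follows. This is an illustrative sketch only: the function name and the range of years are assumptions, and the dummy is assumed to stay equal to one from the start year to the end of the sample, which the text does not state explicitly.

```python
def kyoto_dummy(year, start_year=2005):
    """Return 1 for years on or after start_year, else 0.

    start_year=2005 corresponds to the implementation period;
    start_year=1997 also covers the period from signing onward.
    """
    return 1 if year >= start_year else 0


# Hypothetical sample years (the actual sample period may differ).
years = list(range(1995, 2009))

implementation_dummy = [kyoto_dummy(y) for y in years]
signing_dummy = [kyoto_dummy(y, start_year=1997) for y in years]
```

The two series differ only for 1997 to 2004, so a finding that the results are insensitive to the choice suggests that the estimated shift is driven mainly by the years from 2005 onward.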

From the results above, we can draw a number of overall conclusions. First, there is evidence of conditional convergence in environmental performance (EP) on the global scale. Second, there is evidence of heterogeneity in convergence in EP, both between groups of countries defined by income level and between countries within the high-income group. Third, both the share of capital and the share of fossil fuel use tend to have a negative effect on environmental performance, particularly for the middle- and high-income countries. To achieve better EP, more effort seems to be needed both in reducing the share of fossil fuels and in increasing the use of more energy-efficient capital in the production and consumption processes of the economy.

A potentially important corollary of the results above is that the abatement costs of a relatively fast transition to a lower global emission path may become very high and may also vary substantially between countries, depending on capital intensity and current dependency on fossil fuels. The reason is that, if there is an urgent need to stabilize the CO2 concentration level and to reduce emissions quickly, then capital has to be replaced at a faster rate than under business as usual. As a result, overall abatement costs will increase, and in the presence of large variations in convergence patterns, multilateral agreements may be more difficult to achieve. These difficulties may increase as more countries, especially in Asia, are on a path of fast growth, implying, among other things, an increase in capital intensity.

5.2 Sensitivity analysis

In this section, we assess how sensitive our findings are to not explicitly controlling for differences in environmental stringency across the countries in our sample, especially within the high-income group, where stringency is likely to differ greatly between countries. For instance, the Nordic countries have more stringent environmental rules and laws than some of the continental European countries, such as Italy, Greece and Portugal, and some non-European countries, such as the USA, Chile, South Korea and Kuwait; not controlling for this might therefore bias our results. To address this concern, we use two proxy variables, the rule of law and regulatory quality indices, to capture the effect of environmental stringency. The two proxy variables are only available from 1996 in our data set, which shortens the sample period for this robustness check.

The estimates obtained when controlling for environmental stringency are reported in Table 8 in the appendix. To save space, we only report the estimates for the high-income countries, but the results for the global, low-income and middle-income samples are available on request. We find slight differences in the magnitude of the convergence estimates based on both the rule of law and regulatory quality proxies, as reported in Table 4, but the differences between the baseline estimates and those that control for environmental stringency are insignificant.

Table 4 Distribution of beta convergence for the high-income countries with and without controlling for environmental stringency

The estimated convergence parameter (taking average capital over time for the high-income countries) ranges from −0.25 to −0.20 when based on rule of law, and from −0.21 to −0.16 when based on regulatory quality. Compared to our baseline values of −0.14 to −0.10, these values do not greatly change the general picture that the convergence rate for the high-income countries is small. A similar picture emerges when we instead take the average capital over countries for the high-income group, as shown in Table 4.

6 Conclusion

Three key issues are addressed in this study. First, we provide a simple framework for constructing an environmental performance (EP) index based on production theory. The theory provides an easy procedure for constructing an index with modest data requirements. Second, we address the question of convergence in EP at the global level, and third, we address the question of possible heterogeneity in EP, both between groups of countries (in line with the so-called “club convergence”) and between countries.

Our findings can be summarized as follows. First, the simple theory shows that the EP index we construct is simply the rate of change in the inverse of emission intensity (emission intensity is defined as the ratio of CO2 emissions to GDP). This is because the production process is assumed to result in two outputs, a good output (GDP, in this case) and a bad output (CO2). Since our interest is in EP in relation to CO2 emissions, the construction of the index via this approach is appropriate, given that the GDP variable captures most of the good output in the economy and that the CO2 variable captures most of the CO2 emissions generated in producing the good output. The emphasis here is on CO2-related EP, not EP in general.
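As an illustration of the definition above, the sketch below computes the inverse of emission intensity (GDP per unit of CO2) and its rate of change, approximated here by the log difference between consecutive periods. The function names and the numbers are hypothetical, and the log-difference approximation is an assumption made for this sketch rather than the exact construction used in the study.

```python
import math


def inverse_emission_intensity(gdp, co2):
    """Good output per unit of bad output: GDP / CO2."""
    return gdp / co2


def ep_growth(gdp_series, co2_series):
    """Rate of change of GDP/CO2 between consecutive periods,
    approximated by the first difference of its natural log."""
    levels = [
        inverse_emission_intensity(g, c)
        for g, c in zip(gdp_series, co2_series)
    ]
    return [math.log(b) - math.log(a) for a, b in zip(levels, levels[1:])]


# Hypothetical data: GDP grows by 10% while emissions stay flat.
gdp = [100.0, 110.0]
co2 = [50.0, 50.0]
growth = ep_growth(gdp, co2)  # one value, approximately ln(1.1)
```

In this example GDP rises while emissions are unchanged, so more good output is produced per unit of bad output and the index is positive.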

Second, based on the constructed EP index, we tested both the unconditional and conditional β-convergence hypotheses at the global level. Here we find strong evidence in support of conditional β-convergence in EP in the global sample. This finding is consistent with Brännlund et al. (2015), who find evidence of convergence in EP for Swedish industries, and with Camarero et al. (2008), whose results are based on 22 OECD countries. Further, the results indicate that the rate of convergence does not vary with capital and fossil fuel use when we employ the whole global sample. The results also show a significant negative effect of capital and the use of fossil fuel on EP.
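For readers less familiar with the β-convergence test, the sketch below illustrates the unconditional version in its simplest textbook form: average EP growth is regressed on the initial (log) level of EP, and a negative slope indicates that countries starting from a lower level grow faster. The data are invented for illustration, the helper `ols_slope` is a hypothetical name, and the specification estimated in this study (Eq. (6)) additionally includes conditioning variables and interaction terms that are omitted here.

```python
def ols_slope(x, y):
    """Slope coefficient from a simple OLS regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


# Hypothetical cross-section of four countries (NOT the paper's data):
# initial log EP level and average EP growth over the sample period.
initial_log_ep = [0.0, 1.0, 2.0, 3.0]
average_growth = [0.06, 0.05, 0.03, 0.02]

beta = ols_slope(initial_log_ep, average_growth)
converging = beta < 0  # a negative beta is evidence of convergence
```

In the conditional version, the same coefficient is estimated after controlling for country characteristics such as capital and fossil fuel use, so that countries converge to their own steady states rather than to a common one.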

Moreover, the results reveal heterogeneity in conditional β-convergence when the data are divided into three sub-samples based on per capita income (low-, middle-, and high-income samples). They further reveal that the rates of convergence vary not only between groups of countries, but also between countries within the high-income sample. However, we find no evidence of differences in convergence between countries within the low-income sample. This means that differences in the rate of convergence tend to be more pronounced between countries within the high-income group, likely due to differences in the share of fossil fuels and the possible effects of trade on the impact of differences in capital-intensive production processes and in the stringency of environmental regulation, as found in Managi et al. (2009).

Furthermore, our findings show that the rate of convergence for the high-income sample varies with the share of the capital stock, with high capital intensity countries having a low rate of convergence to the steady state relative to countries with low capital intensity. This result is new compared to the findings of previous research in this area. This heterogeneity between countries, and especially the slow convergence rate among countries with abundant capital, may cause severe problems in negotiations over burden-sharing. On the one hand, low-income/capital countries will argue that high-income/capital countries have a responsibility to reduce emissions more. On the other hand, this would imply high abatement costs for those countries, since capital would have to be scrapped at a much faster rate than under business as usual.

Lastly, the level of the fossil fuel share has a negative effect on environmental performance in each of the samples, but the effect tends to vary with capital in the case of the middle- and high-income countries. Interestingly, we also find a significant effect of the Kyoto protocol on EP, which reflects significant positive shifts in the period during which the Kyoto protocol was in effect relative to other years.

These findings suggest that we need policies that promote both energy efficiency and conservation to improve EP, since both fossil fuel use and capital inversely impact EP. Additionally, since every production process results in a good output (GDP) as well as a bad output (CO2), and since most of the CO2 is generated from the use of fossil fuel, it is important that economic stimulus policies are well balanced with energy conservation policies to promote growth without compromising environmental performance. Finally, since there seem to be significant differences in the growth paths of different groups of countries, as well as differences in the steady state, a global agreement on burden-sharing of CO2 emissions has to take these differences into account.

Our findings further suggest that investments in energy-conserving measures, especially capital investments, tend to have a positive impact on EP after the year 2000 for the high-income countries, implying that conservation measures in the form of capital investment are yielding positive results in terms of producing more good output with less bad output. We also see a declining capital elasticity over the study period for the middle-income countries. This implies, among other things, that both middle- and high-income countries are able to produce more output with less carbon emission in the later years than in the earlier years, which translates into an improvement in environmental performance across these countries. The findings on the effect of capital intensity, together with the positive effect of the Kyoto protocol, complement the findings in Zhou et al. (2010), irrespective of differences in data set, time period, focus of the study and methodology.